Contact E-mail: acdean@rmd.ac.in kkthyagharajan@yahoo.com kkthyagharajan@gmail.com
Dr. K.K. THYAGHARAJAN
Professor & Dean (Academic)
Department of Electronics and Communication Engineering
RMD ENGINEERING COLLEGE
PROCESSOR ORGANIZATION
Click on the links given below to view videos.
UNIT – I Computer Architecture
PPT-PDF http://dx.doi.org/10.13140/RG.2.2.28687.20643
https://youtu.be/DcMM_dIxWEE
https://youtu.be/JoSONsTuopk
UNIT – II Computer Architecture
PPT-PDF http://dx.doi.org/10.13140/RG.2.2.36236.95363
https://youtu.be/thC8B4B-PyY
https://youtu.be/m7JtcP5QmFA
https://youtu.be/NbfTKSm4ubM
https://youtu.be/RhiBtztCESI
UNIT – IV Computer Architecture & Microprocessors
PPT-PDF http://dx.doi.org/10.13140/RG.2.2.20718.02880
https://youtu.be/LroA8T-_vqs
https://youtu.be/CU1wx8EZmvc
https://youtu.be/zYADaZ5sfY0
https://youtu.be/GuC7sZEw-uM
PROCESSOR ORGANIZATION
FLYNN’S CLASSIFICATION OF MULTIPROCESSOR ORGANIZATION
This video explains SISD, SIMD, MISD and MIMD classification
Flynn’s Classification of Processor Organization
1. SISD (Single Instruction stream Single Data stream)
2. SIMD (Single Instruction stream Multiple Data stream)
3. MISD (Multiple Instruction stream Single Data stream)
4. MIMD (Multiple Instruction stream Multiple Data stream)
Processor organization deals with how the parts of a processor, such as control units and processing elements (ALUs), are linked together in a multiprocessor to improve performance.
Flynn’s Classification of Processor Organization
1. SISD
2. SIMD
[Figure: SISD organization. A control unit sends the instruction stream (IS) to one ALU, which exchanges a single data stream (DS) with the main memory.]
 SISD has a single control unit and gets a single instruction from the main memory at a time.
 It has one processing element (ALU) and uses one data stream connected with the main memory.
 The processing unit may have several functional units (add, multiply, load, etc.).
 Instruction stream (Is) = Data stream (Ds) = 1
 SIMD is more suitable for handling arrays in for loops, where data parallelism is achieved; see the C sketch after the figure.
 SIMD gets only one copy of the code from the main memory, and this operation is performed on multiple independent data obtained from the main memory, as shown in the figure.
 It reduces the instruction bandwidth and instruction space.
 It is not suitable for case … switch statements, because the execution unit (processing unit) must perform different operations based on the data.
 The processor should complete the current instruction before it takes the next one, i.e. execution of instructions is synchronous.
[Figure: SIMD organization. One control unit sends the same instruction stream (IS) to processing elements PE1…PEn; each PE exchanges its own data stream (DS1…DSn) with memory modules MM1…MMn. PE – Processing Element, MM – Main Memory]
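As a software illustration (a minimal C sketch, not from the original slides): the first loop applies one identical operation to independent elements, which is the pattern SIMD handles well; the second uses data-dependent case selection, which a single shared instruction stream fits poorly.

/* SIMD-friendly: the same operation on every independent element */
void scale(double *a, const double *b, int n) {
    for (int i = 0; i < n; i++)
        a[i] = 2.0 * b[i];
}

/* SIMD-unfriendly: the operation to perform depends on the data itself */
void classify(double *a, const int *tag, int n) {
    for (int i = 0; i < n; i++) {
        switch (tag[i]) {
        case 0:  a[i] += 1.0; break;
        default: a[i] -= 1.0; break;
        }
    }
}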
Flynn’s Classification of Processor Organization
3. MISD
[Figure: MISD organization. Control units CU1…CUn each send their own instruction stream (IS1…ISn) to processing elements PE1…PEn, while a single data stream (DS) from the shared main memory passes through all PEs.]
PE – Processing Element, CU – Control Unit
Is > 1 and Ds = 1
 Multiple control units are used to control multiple processing units.
 Each control unit handles one instruction and processes it through its corresponding processing element.
 Only one data stream passes through all processing elements at a time, drawn from a common shared memory.
Flynn’s Classification of Processor Organization
4. MIMD
[Figure: MIMD organization. Control units CU1…CUn issue independent instruction streams IS1…ISn to processing elements PE1…PEn, and each PE draws its own data stream DS1…DSn from main memory modules MM1…MMn.]
PE – Processing Element, CU – Control Unit
Is > 1 and Ds > 1
 Multiple control units are used to handle multiple instructions at the same time.
 Multiple processing elements are used, with a separate data stream drawn from main memory for each processing element.
 Each process works on its own instructions and its own data.
 Tasks executed by different processes are asynchronous, i.e. each task can start or finish at a different time.
 This organization represents a real parallel computer; an example is the Graphics Processing Unit (GPU).
 SIMD is a vector architecture and is used for data-level parallelism.
 In a vector architecture, data are collected from memory and put in proper order into a set of registers. These registers are operated on sequentially using pipelined execution units, and the results are written back to memory.
 If two vectors A and B are to be added and the result stored in vector C, this can be written as C = A + B, where A(1), A(2), etc. are vector elements.
 Figure 1 shows a single 'add' pipeline that adds vectors A and B and stores the result in vector C. Here one addition is performed per cycle.
SIMD (Single Instruction stream Multiple Data stream)
[Figures 1–3: a single add pipeline; four add lanes; four lanes, each with FP add, FP multiply, and load/store units]
 Figure 2 uses four add pipelines, or lanes, so it completes four additions per cycle.
 The number of clocks required to execute a vector addition is therefore reduced by a factor of 4.
 Each vector lane uses a portion of the vector register.
 Figure 3 also uses four lanes, but each lane has more than one functional unit: three functional units (FP add, FP multiply, and a load/store unit) are provided.
 The elements of a single vector are interleaved across the four lanes, and each lane uses a portion of the vector register.
 The vector storage is divided across the four lanes, and each lane holds every fourth element of each vector register.
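As an illustration of this interleaving (a small C sketch, not from the slides; the lane count and vector length are assumptions), lane j holds elements j, j+4, j+8, … of each vector register:

#include <stdio.h>

#define LANES 4     /* four vector lanes, as in Figure 3 */
#define VLEN  16    /* assumed vector length for illustration */

int main(void) {
    for (int lane = 0; lane < LANES; lane++) {
        printf("lane %d holds elements:", lane);
        for (int i = lane; i < VLEN; i += LANES)  /* every fourth element */
            printf(" %d", i);
        printf("\n");
    }
    return 0;
}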
SIMD (Single Instruction stream Multiple Data stream)
 Old array processors used 64 ALUs to perform 64 additions simultaneously.
 A SIMD vector architecture instead uses a smaller number of ALUs (even one) and passes the data through lanes and pipelines. This reduces the hardware cost.
 In the MIPS vector architecture, 32 vector registers are provided, and each register holds 64 vector elements, each 64 bits in size.
 The hardware gets the addresses of the vector elements from these vector registers. Such indexed accesses are called gather-scatter.
 The data need not be contiguous in main memory. Indexed load instructions gather data from the main memory and put them in contiguous vector elements.
 Indexed store instructions scatter vector elements across main memory.
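In C terms, gather and scatter look as follows (an illustrative sketch, not from the slides; the function names are made up). The indexed vector load and store instructions perform the same data movements in hardware:

/* Gather: collect non-contiguous data into contiguous vector elements */
void gather(double *v, const double *mem, const int *index, int n) {
    for (int i = 0; i < n; i++)
        v[i] = mem[index[i]];   /* indexed load */
}

/* Scatter: spread vector elements back across main memory */
void scatter(double *mem, const double *v, const int *index, int n) {
    for (int i = 0; i < n; i++)
        mem[index[i]] = v[i];   /* indexed store */
}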
 The number of elements in a vector operation is not encoded in the instruction or opcode; it is kept in a separate register.
 MIPS vector instructions are obtained by appending the letter 'v' to MIPS instructions, for example:
 addv.d # adds two double-precision vectors. This instruction accesses its inputs through two vector registers.
 addvs.d # takes one input from a scalar register and the other through a vector register. The scalar input is added to each element of the vector.
 lv # load vector (double-precision data)
 sv # store vector (double-precision data)
SIMD (Single Instruction stream Multiple Data stream)
Problem:
Write a MIPS program using vector instructions to solve Y = a*X + Y,
where X and Y are vectors (arrays) of 64 double-precision floating-point numbers, i.e. 64 numbers of 64-bit size each, stored in memory.
Assume that the starting address of X is in $s0 and the starting address of Y is in $s1.
Solution:
l.d $f0, a($sp) # load scalar 'a' into register f0
lv $v1, 0($s0) # load vector X, pointed to by register s0, into register v1
mulvs.d $v2, $v1, $f0 # multiply vector v1 by scalar f0 and store the result in vector v2
lv $v3, 0($s1) # load vector Y into v3
addv.d $v4, $v2, $v3 # add vector Y (in v3) to the product (in v2) and store the result in vector v4
sv $v4, 0($s1) # store the result v4 in vector Y (pointed to by register s1)
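For comparison (a plain-C sketch, not part of the original slides), this scalar loop computes the same Y = a*X + Y; the vector program above replaces all 64 iterations with a handful of vector instructions:

#include <stddef.h>

/* DAXPY: Y = a*X + Y over n double-precision elements */
void daxpy(double a, const double *x, double *y, size_t n) {
    for (size_t i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];   /* what mulvs.d + addv.d do across all lanes */
}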
Multithreading
Hardware Multithreading
This video explains three types of hardware multithreading
1. Hardware Multithreading
Thread: a thread is a sequence of instructions that can run independently of other programs.
When a sequence of instructions is being executed, the processor may have to wait if the next instruction or data is not available. This is called stalling.
Instead of waiting, the processor may switch to another thread, execute it, and come back to this thread.
Multithreading: switching from one (stalled) thread to another thread is known as multithreading.
All the threads generally share a single address space, while each thread keeps its own program counter, stack, and register state.
Process: a process includes one or more threads and their address space. Switching from one process to another invokes the operating system (OS).
The main difference between thread switching and process switching: multithreading works within a single address space and does not invoke the OS, whereas switching between processes crosses address spaces and requires the help of the operating system. So you can say that a thread is lightweight, smaller than a process.
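A minimal software illustration (a C sketch using POSIX threads, not from the slides): two threads created in the same process share its address space, so both see the same global variable without any OS-level copying.

#include <pthread.h>
#include <stdio.h>

int shared = 42;   /* one copy, visible to every thread in the process */

void *worker(void *arg) {
    printf("thread %ld sees shared = %d\n", (long)arg, shared);
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, (void *)1L);
    pthread_create(&t2, NULL, worker, (void *)2L);
    pthread_join(t1, NULL);   /* wait for both threads to finish */
    pthread_join(t2, NULL);
    return 0;
}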
Types of multithreading:
Fine-grained multithreading
Coarse-grained multithreading
Simultaneous multithreading
1.1 Fine-grained multithreading
 Switching between threads happens on each instruction.
 Switching happens on every clock cycle.
 Switching is done in a round-robin fashion, as shown in Figure 1: the 1st instruction of the 1st thread is executed, then the 1st instruction of the 2nd thread, and so on until the 1st instruction of the last thread completes. Then the 2nd instruction of the 1st thread is executed, followed by the 2nd instruction of the 2nd thread, and so on.
Advantage: if any thread is stalled, it is skipped and the next thread continues. This is called interleaving, and it improves throughput.
Disadvantage: threads that are ready to execute their next instruction without stalling must still wait for the other threads' turns. This slows down the execution of individual threads.
1.2 Coarse-grained multithreading
Threads are switched only when costly stalls (e.g. last-level cache misses) occur. Frequent thread switching is thus avoided, and so is the resulting slowdown of individual thread execution.
Disadvantage: because instructions are issued from a single thread, the pipeline must be emptied or frozen on a switch, and the new thread starts executing only after the pipeline fills up. This is called start-up overhead.
Advantage: for shorter stalls the processor keeps issuing instructions from the same thread, so the pipeline does not have to be emptied or frozen for them, and the throughput cost is minimized.
This approach reduces the penalty of high-cost stalls, because the pipeline refill time is negligible compared to the stall time.
1.3 Simultaneous multithreading (SMT)
 Multiple instructions are executed from multiple independent threads using register renaming. If the threads are dependent, the dependencies are handled by dynamic scheduling. SMT does not switch resources every clock cycle.
 SMT processors have dynamically scheduled pipelines that exploit both thread-level parallelism and instruction-level parallelism. These processors have more functional units to implement the parallelism.
 Intel calls this hyper-threading; AMD calls it SMT.
Figure 1 shows three threads that execute independently, with stalling. The empty rows indicate unused clock cycles (stalls). One row of each thread is issued to the pipeline in each clock cycle.
Figures 2a, 2b, and 2c show how the three threads of Figure 1 are executed when fine-grained, coarse-grained, and simultaneous multithreading, respectively, are applied.
[Figure 1: three independent threads (A, B, C), each with costly stalls, shown against time.
Figure 2: the same three threads executed with multithreading: (2a) fine-grained MT, (2b) coarse-grained MT, (2c) simultaneous MT.]
Multiprocessing
Multiprocessing Systems
Multicore processor & Multiprocessor
1. Multiprocessing Systems
1. Multiprocessor system
   1.1 Shared memory system (tightly coupled system)
       1.1.1 Uniform Memory Access (UMA)
       1.1.2 Non-Uniform Memory Access (NUMA)
   1.2 Distributed memory system (loosely coupled system): cluster
Two types of multiprocessing:
 More than one processor on a single chip: a multicore processor. A quad-core processor can handle 4 threads and an octa-core 8 threads; more cores allow better multiprocessing. An on-chip network connects the processors.
 More than one processor connected in a single system: a multiprocessor system, which uses a Local Area Network (LAN) to connect the systems.
1.1 Shared Memory (tightly coupled) System
• All the processors share a single global memory. The global memory may be divided into many modules, but a single address space is used (e.g. a multicore processor).
• Processors communicate using shared locations (variables) in the global memory.
• These shared data are coordinated using locks (synchronization primitives), which allow the data to be accessed by only one processor at a time.
• Shared memory systems use a common bus, a crossbar, or a multistage network to connect processors, memory, and I/O devices.
• Programs stored in the virtual address space of each processor can run independently.
• This arrangement is used for high-speed real-time processing and provides higher throughput than loosely coupled systems.
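A minimal C sketch of the lock idea (using POSIX threads as a software stand-in for the synchronization primitives; not from the slides): the mutex lets only one thread update the shared location at a time.

#include <pthread.h>

long shared_counter = 0;   /* shared location in the global memory */
pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

void *increment(void *arg) {
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&lock);     /* only one holder at a time */
        shared_counter++;
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, increment, NULL);
    pthread_create(&t2, NULL, increment, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return 0;   /* shared_counter is exactly 200000 because of the lock */
}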
1.1.1 Uniform Memory Access (UMA) System
UMA systems are divided into two types:
1. Symmetric UMA (SUMA)
2. Asymmetric UMA (AUMA)
In SUMA, all processors are identical. Processors may have local cache memories and I/O devices. Physical memory is uniformly shared by all processors, with equal access time to all words.
In AUMA, one master processor executes the operating system, and the other processors may be dedicated to special tasks such as graphics rendering, mathematical functions, etc.
[Figure: UMA system. Processors 1…n, each with a cache, connect through an interconnection network to shared memory and I/O.]
1.1.2 Non-uniform Memory Access (NUMA) System
In a NUMA system, each processor may have a local memory. Local memories hold their own private programs and private data. The collection of all local memories forms the global memory, i.e. the local memory of one processor may be accessed by another processor through shared variables.
The time taken for a remote processor to access the local memory of another processor is not uniform: it depends on the locations of the processor and the memory.
NUMA systems can scale to larger sizes, with lower latency (access time) to local memory.
[Figure: NUMA system. Processors P1…Pn with local memories M1…Mn share a processor-memory interconnection network and an interrupt-signal interconnection network; I/O processors reach devices D1…Dn through I/O channels and an I/O interconnection network.]
1.2 Distributed Memory System (DMS)
(Loosely Coupled System)
• DMS systems do not use a global shared memory: use of a global memory creates memory conflicts and slows down execution.
• A DMS has multiple processors, and each processor has a large local memory and a set of I/O devices that are not shared by any other processor. So this system is called a distributed multicomputer system.
• The group of computers connected together is called a cluster, and each computer is called a node.
• These computers communicate with each other by passing messages through an interconnection network.
• To pass messages to other computers in the cluster, a 'send message' routine is used.
• To receive messages from other computers in the cluster, a 'receive message' routine is used.
[Figure: distributed memory system (loosely coupled). Nodes, each pairing a local memory with a processor (LM1+P1 … LMn+Pn), communicate through a message-passing interconnection network. LM – Local Memory, P – Processor]
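A minimal message-passing sketch in C (assuming MPI, a standard library that provides exactly such send and receive routines; not from the slides): node 0 sends a value to node 1 over the interconnection network.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, value;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* which node is this? */
    if (rank == 0) {
        value = 123;
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);   /* 'send message' */
    } else if (rank == 1) {
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);                          /* 'receive message' */
        printf("node 1 received %d\n", value);
    }
    MPI_Finalize();
    return 0;
}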
Multiprocessor Network Topologies
Multiprocessor Network Topologies
o Showing how the nodes (processor and memory) are connected is called the network topology.
o Multicore chips require on-chip networks to connect the cores together.
o Clusters require local area networks to connect servers together.
o Networks are drawn as graphs, and the edges of the graph represent the links of the communication network.
o Nodes (computers or processor-memory nodes) are connected to this graph through network switches.
o In the following diagrams, coloured circles represent switches and black squares represent processor-memory nodes.
o Network costs include the number of switches, the number of links per switch, the length of the links, and the width (number of bits) per link.
Bus Topology
1. Uses a shared set of wires that allows broadcasting messages to all nodes at the same time.
2. Bandwidth of the network = bandwidth of the bus.
Ring Topology
1. Messages must travel along intermediate nodes until they arrive at the final destination. A ring is capable of many simultaneous transfers.
2. Bandwidth of the network = bandwidth of each link × number of links.
3. If a link is as fast as the bus, a ring is P times faster than the bus in the best case.
Fully Connected Network
4. In a fully connected network, every processor (P) has a bidirectional link to every other processor.
5. Total bandwidth = P × (P − 1)/2 times the link bandwidth.
6. Bisection bandwidth = (P/2)². Bisection bandwidth is calculated by dividing the machine into two halves and counting the bandwidth of the links that cross the cut.
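A quick worked check of these formulas for P = 8 processors (arithmetic only, not from the slides):

\[
\begin{aligned}
\text{Ring bandwidth} &= P \times \text{link bandwidth} = 8 \times \text{link bandwidth},\\
\text{Fully connected total bandwidth} &= \frac{P(P-1)}{2} = \frac{8 \times 7}{2} = 28 \text{ links' worth},\\
\text{Fully connected bisection bandwidth} &= \left(\frac{P}{2}\right)^{2} = 4^{2} = 16 \text{ links' worth.}
\end{aligned}
\]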
Star Topology
• Instead of placing a processor at every node in a network, a switch is placed at some of these nodes. Switches are smaller than processor-memory nodes.
• In a star topology, all nodes are connected to a central device called a hub using point-to-point connections.
Boolean Cube Network
For the cube, n = 3, so n (= 3) links per switch connect to other switches in this network, and one additional link goes to the processor. 2^n nodes are connected (2^3 = 8).
[Figure: 3-cube with a switch at each corner and a processor-memory node attached to each switch]
2D Grid or Mesh Network
Here n = 2, so n (= 2) links per switch connect to neighbouring switches, and one additional link goes to the processor.
Multistage Networks: messages can travel in multiple steps.
1. Fully connected or crossbar networks: any node can communicate with any other node in one pass through the network.
2. Omega networks: use less hardware than the crossbar network, but contention may occur between messages.
Crossbar Network
n = number of processors = 8
Number of switches = n² = 64
Any node can communicate with any other node in one pass through the network.
Omega Network
Uses less hardware than the crossbar network. There are 12 switch boxes in the network shown in the figure, and each switch box contains 4 smaller switches.
Number of switches used = 12 × 4 = 48.
This is given by the formula 2n log2 n; here n = number of processors = 8, so the number of switches used = 2 × 8 × log2 8 = 2 × 8 × 3 = 48, as above.
This network cannot support all combinations of message passing simultaneously, because messages may contend for shared switches. For example, P0 cannot send a message to P6 while the required path is in use, and if P1 sends a message to P4, then P0 may not send messages to P4 or P5 at the same time.
[Figure: omega network for 8 processors built from 12 switch boxes; each switch box has ports A, B, C, and D]
Graphics Processing Unit (GPU)
Architecture of GPU (MIMD Processor)
Graphics Processing Unit (GPU)
o A GPU is a processor specially designed for handling graphics rendering tasks.
o GPUs are used to accelerate processing in video editing, video game rendering, 3D modelling (AutoCAD), AI-based tasks, etc.
o A GPU breaks complex problems into many tasks and works on them in parallel.
o GPUs are highly multithreaded.
Central Processing Unit (CPU)
1. The CPU has general-purpose instructions and is more suitable for serial processing.
2. CPUs have just a few cores with caches and can handle only a few threads at a time.
3. The CPU has a large main memory oriented toward low latency.
4. The CPU is designed mainly for instruction-level parallelism.
Graphics Processing Unit (GPU)
1. The GPU has its own special instructions to handle graphics and is more suitable for parallel processing.
2. GPUs have hundreds of cores and can handle thousands of threads simultaneously.
3. The GPU has a separate large main memory oriented toward bandwidth rather than latency, providing high throughput.
4. The GPU is designed for data-level parallelism.
Graphics Processing Unit (GPU)
GPU Architecture - NVIDIA
 NVIDIA is an American company that developed the 'Compute Unified Device Architecture' (CUDA) for its graphics processing units; Fermi is the commercial name of one such architecture.
 GeForce is a brand of graphics processing units designed by NVIDIA.
 A GPU contains a collection of multithreaded SIMD processors, and hence it is an MIMD processor.
 In the Fermi architecture, the number of SIMD processors is 7, 11, 14, or 15.
 A CUDA program launches kernels that execute in parallel.
 The GPU executes a kernel on a grid; a grid consists of many thread blocks, and each thread block consists of many threads that are executed in parallel (Figure 2).
 Each thread within a thread block is called a machine object; it has a thread ID, program instructions, a program counter, registers, per-thread private memory, inputs, and outputs.
 The machine object is created, managed, scheduled, and executed by the GPU.
 The GPU has two schedulers:
1. Thread block scheduler: assigns blocks of threads to multithreaded SIMD processors.
2. SIMD thread scheduler: operates within each SIMD processor and has a controller. It identifies the threads that are ready to run, schedules them for execution, and sends them to the dispatcher when needed.
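A plain-C model of the grid / thread-block / thread hierarchy (illustrative only, not actual CUDA device code; the function and parameter names are made up): each simulated thread is identified by its block index and its thread index within the block, and handles one element.

/* CPU-side model of a GPU grid: num_blocks thread blocks,
   each containing threads_per_block threads. On a real GPU,
   every iteration of the inner loop is one hardware thread. */
void run_kernel_model(float *out, const float *in, int n,
                      int num_blocks, int threads_per_block) {
    for (int block = 0; block < num_blocks; block++) {
        for (int thread = 0; thread < threads_per_block; thread++) {
            int id = block * threads_per_block + thread;  /* global thread ID */
            if (id < n)
                out[id] = 2.0f * in[id];   /* each thread handles one element */
        }
    }
}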
Graphics Processing Unit (GPU)
 The off-chip DRAM is shared by all thread blocks; it is called global memory or GPU memory.
 The local memory of each SIMD processor is shared by the lanes within that processor, but it is not shared between two SIMD processors.
 7, 11, 14, or 15 SIMD processors are used in the different Fermi variants.
[Figure 1: multithreaded SIMD processor. Figure 2: GPU memory structure, with grids (Grid 0, Grid 1) containing thread blocks, which contain threads.]
Scheduling
Parallel processing involves:
o Scheduling: a method of sharing resources (processor time, data lines) among threads and processes and of balancing the load to achieve quality of service.
o Partitioning the work into parallel pieces: the work must be divided equally among all processors to avoid idle time.
o Balancing the load evenly between the processors: processors should not stay idle for long.
o Time required for synchronization: a processor should complete the allotted work in the specified time and receive the next piece of work in time; otherwise parallel processing will not be possible.
o Overhead for communication: inter-process communication has a time overhead, because the system must determine which processes have to communicate and which do not.
Types of scheduling:
 Long-term scheduling
 Medium-term scheduling
 Short-term scheduling
 Dispatcher
Scheduling
Problem:
To achieve a speed-up of 90 with 100 processors, what percentage of the original computation can be sequential?
Amdahl's Law
Fraction of time affected = execution time affected / execution time before
Speed-up = 90; amount of improvement = 100.
Substituting into Amdahl's law gives: fraction of time affected ≈ 0.999.
So, to achieve a speed-up of 90 with 100 processors, the sequential percentage can be at most 0.1%.
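The substitution spelled out (a worked derivation consistent with the slide's numbers, where f is the fraction of time that can be parallelized):

\[
\text{Speed-up} = \frac{1}{(1-f) + f/100} = 90
\quad\Rightarrow\quad (1-f) + \frac{f}{100} = \frac{1}{90}
\quad\Rightarrow\quad f = \frac{1 - 1/90}{1 - 1/100} \approx 0.999,
\]
\[
\text{so the sequential fraction is } 1 - f \approx 0.001 = 0.1\%.
\]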

More Related Content

What's hot

09 chapter04 timers_fa14
09 chapter04 timers_fa1409 chapter04 timers_fa14
09 chapter04 timers_fa14John Todora
 
Memory Reference Instructions
Memory Reference InstructionsMemory Reference Instructions
Memory Reference InstructionsRabin BK
 
Computer arithmetics (computer organisation & arithmetics) ppt
Computer arithmetics (computer organisation & arithmetics) pptComputer arithmetics (computer organisation & arithmetics) ppt
Computer arithmetics (computer organisation & arithmetics) pptSuryaKumarSahani
 
Chapter 2 instructions language of the computer
Chapter 2 instructions language of the computerChapter 2 instructions language of the computer
Chapter 2 instructions language of the computerBATMUNHMUNHZAYA
 
Pipeline and data hazard
Pipeline and data hazardPipeline and data hazard
Pipeline and data hazardWaed Shagareen
 
VTU 4TH SEM CSE MICROPROCESSORS SOLVED PAPERS OF JUNE-2014 & JUNE-2015
VTU 4TH SEM CSE MICROPROCESSORS SOLVED PAPERS OF JUNE-2014 & JUNE-2015VTU 4TH SEM CSE MICROPROCESSORS SOLVED PAPERS OF JUNE-2014 & JUNE-2015
VTU 4TH SEM CSE MICROPROCESSORS SOLVED PAPERS OF JUNE-2014 & JUNE-2015vtunotesbysree
 
COMPUTER ORGANIZATION NOTES Unit 2
COMPUTER ORGANIZATION NOTES  Unit 2COMPUTER ORGANIZATION NOTES  Unit 2
COMPUTER ORGANIZATION NOTES Unit 2Dr.MAYA NAYAK
 
Architecture of 8086
Architecture of 8086Architecture of 8086
Architecture of 8086MOHAN MOHAN
 
Register transfer and microoperations
Register transfer and microoperationsRegister transfer and microoperations
Register transfer and microoperationsmahesh kumar prajapat
 
Bca 2nd sem-u-2.1-overview of register transfer, micro operations and basic c...
Bca 2nd sem-u-2.1-overview of register transfer, micro operations and basic c...Bca 2nd sem-u-2.1-overview of register transfer, micro operations and basic c...
Bca 2nd sem-u-2.1-overview of register transfer, micro operations and basic c...Rai University
 
Addressing mode of 80286 microprocessor
Addressing mode of 80286 microprocessorAddressing mode of 80286 microprocessor
Addressing mode of 80286 microprocessorpal bhumit
 
Evaluation of High Speed and Low Memory Parallel Prefix Adders
Evaluation of High Speed and Low Memory Parallel Prefix AddersEvaluation of High Speed and Low Memory Parallel Prefix Adders
Evaluation of High Speed and Low Memory Parallel Prefix AddersIOSR Journals
 
Address translation-mechanism-of-80386 by aniket bhute
Address translation-mechanism-of-80386 by aniket bhuteAddress translation-mechanism-of-80386 by aniket bhute
Address translation-mechanism-of-80386 by aniket bhuteAniket Bhute
 
central processing unit and pipeline
central processing unit and pipelinecentral processing unit and pipeline
central processing unit and pipelineRai University
 
Register transfer and micro-operation
Register transfer and micro-operationRegister transfer and micro-operation
Register transfer and micro-operationNikhil Pandit
 

What's hot (20)

Memory Reference Instructions
Memory Reference InstructionsMemory Reference Instructions
Memory Reference Instructions
 
09 chapter04 timers_fa14
09 chapter04 timers_fa1409 chapter04 timers_fa14
09 chapter04 timers_fa14
 
Memory Reference Instructions
Memory Reference InstructionsMemory Reference Instructions
Memory Reference Instructions
 
Computer arithmetics (computer organisation & arithmetics) ppt
Computer arithmetics (computer organisation & arithmetics) pptComputer arithmetics (computer organisation & arithmetics) ppt
Computer arithmetics (computer organisation & arithmetics) ppt
 
Chapter 2 instructions language of the computer
Chapter 2 instructions language of the computerChapter 2 instructions language of the computer
Chapter 2 instructions language of the computer
 
Cao
CaoCao
Cao
 
Pipeline and data hazard
Pipeline and data hazardPipeline and data hazard
Pipeline and data hazard
 
VTU 4TH SEM CSE MICROPROCESSORS SOLVED PAPERS OF JUNE-2014 & JUNE-2015
VTU 4TH SEM CSE MICROPROCESSORS SOLVED PAPERS OF JUNE-2014 & JUNE-2015VTU 4TH SEM CSE MICROPROCESSORS SOLVED PAPERS OF JUNE-2014 & JUNE-2015
VTU 4TH SEM CSE MICROPROCESSORS SOLVED PAPERS OF JUNE-2014 & JUNE-2015
 
COMPUTER ORGANIZATION NOTES Unit 2
COMPUTER ORGANIZATION NOTES  Unit 2COMPUTER ORGANIZATION NOTES  Unit 2
COMPUTER ORGANIZATION NOTES Unit 2
 
Architecture of 8086
Architecture of 8086Architecture of 8086
Architecture of 8086
 
Ch2 csda
Ch2 csdaCh2 csda
Ch2 csda
 
Register transfer and microoperations
Register transfer and microoperationsRegister transfer and microoperations
Register transfer and microoperations
 
Bca 2nd sem-u-2.1-overview of register transfer, micro operations and basic c...
Bca 2nd sem-u-2.1-overview of register transfer, micro operations and basic c...Bca 2nd sem-u-2.1-overview of register transfer, micro operations and basic c...
Bca 2nd sem-u-2.1-overview of register transfer, micro operations and basic c...
 
Addressing mode of 80286 microprocessor
Addressing mode of 80286 microprocessorAddressing mode of 80286 microprocessor
Addressing mode of 80286 microprocessor
 
Evaluation of High Speed and Low Memory Parallel Prefix Adders
Evaluation of High Speed and Low Memory Parallel Prefix AddersEvaluation of High Speed and Low Memory Parallel Prefix Adders
Evaluation of High Speed and Low Memory Parallel Prefix Adders
 
Tcp ip
Tcp ipTcp ip
Tcp ip
 
Address translation-mechanism-of-80386 by aniket bhute
Address translation-mechanism-of-80386 by aniket bhuteAddress translation-mechanism-of-80386 by aniket bhute
Address translation-mechanism-of-80386 by aniket bhute
 
Memory reference
Memory referenceMemory reference
Memory reference
 
central processing unit and pipeline
central processing unit and pipelinecentral processing unit and pipeline
central processing unit and pipeline
 
Register transfer and micro-operation
Register transfer and micro-operationRegister transfer and micro-operation
Register transfer and micro-operation
 

Similar to Ca unit v 27 9-2020

Paper id 25201467
Paper id 25201467Paper id 25201467
Paper id 25201467IJRAT
 
A new cryptosystem with four levels of encryption and parallel programming
A new cryptosystem with four levels of encryption and parallel programmingA new cryptosystem with four levels of encryption and parallel programming
A new cryptosystem with four levels of encryption and parallel programmingcsandit
 
A NEW CRYPTOSYSTEM WITH FOUR LEVELS OF ENCRYPTION AND PARALLEL PROGRAMMING
A NEW CRYPTOSYSTEM WITH FOUR LEVELS OF ENCRYPTION AND PARALLEL PROGRAMMINGA NEW CRYPTOSYSTEM WITH FOUR LEVELS OF ENCRYPTION AND PARALLEL PROGRAMMING
A NEW CRYPTOSYSTEM WITH FOUR LEVELS OF ENCRYPTION AND PARALLEL PROGRAMMINGcscpconf
 
DESIGN OF DOUBLE PRECISION FLOATING POINT MULTIPLICATION ALGORITHM WITH VECTO...
DESIGN OF DOUBLE PRECISION FLOATING POINT MULTIPLICATION ALGORITHM WITH VECTO...DESIGN OF DOUBLE PRECISION FLOATING POINT MULTIPLICATION ALGORITHM WITH VECTO...
DESIGN OF DOUBLE PRECISION FLOATING POINT MULTIPLICATION ALGORITHM WITH VECTO...jmicro
 
Pipelining and vector processing
Pipelining and vector processingPipelining and vector processing
Pipelining and vector processingKamal Acharya
 
Implementation of an Effective Self-Timed Multiplier for Single Precision Flo...
Implementation of an Effective Self-Timed Multiplier for Single Precision Flo...Implementation of an Effective Self-Timed Multiplier for Single Precision Flo...
Implementation of an Effective Self-Timed Multiplier for Single Precision Flo...IRJET Journal
 
A High Speed Transposed Form FIR Filter Using Floating Point Dadda Multiplier
A High Speed Transposed Form FIR Filter Using Floating Point Dadda MultiplierA High Speed Transposed Form FIR Filter Using Floating Point Dadda Multiplier
A High Speed Transposed Form FIR Filter Using Floating Point Dadda MultiplierIJRES Journal
 
loaders-and-linkers.pdfhhhhhccftyghgfggy
loaders-and-linkers.pdfhhhhhccftyghgfggyloaders-and-linkers.pdfhhhhhccftyghgfggy
loaders-and-linkers.pdfhhhhhccftyghgfggyrahulyadav957181
 
Fpga sotcore architecture for lifting scheme revised
Fpga sotcore architecture for lifting scheme revisedFpga sotcore architecture for lifting scheme revised
Fpga sotcore architecture for lifting scheme revisedijcite
 
Implementation of an arithmetic logic using area efficient carry lookahead adder
Implementation of an arithmetic logic using area efficient carry lookahead adderImplementation of an arithmetic logic using area efficient carry lookahead adder
Implementation of an arithmetic logic using area efficient carry lookahead adderVLSICS Design
 
Secure Text Transfer Using Diffie-Hellman Key Exchange Based On Cloud
Secure Text Transfer Using Diffie-Hellman Key Exchange Based On CloudSecure Text Transfer Using Diffie-Hellman Key Exchange Based On Cloud
Secure Text Transfer Using Diffie-Hellman Key Exchange Based On CloudIRJET Journal
 
High Speed and Area Efficient Matrix Multiplication using Radix-4 Booth Multi...
High Speed and Area Efficient Matrix Multiplication using Radix-4 Booth Multi...High Speed and Area Efficient Matrix Multiplication using Radix-4 Booth Multi...
High Speed and Area Efficient Matrix Multiplication using Radix-4 Booth Multi...IRJET Journal
 
MapReduce: Ordering and Large-Scale Indexing on Large Clusters
MapReduce: Ordering and  Large-Scale Indexing on Large ClustersMapReduce: Ordering and  Large-Scale Indexing on Large Clusters
MapReduce: Ordering and Large-Scale Indexing on Large ClustersIRJET Journal
 
user_defined_functions_forinterpolation
user_defined_functions_forinterpolationuser_defined_functions_forinterpolation
user_defined_functions_forinterpolationsushanth tiruvaipati
 
A Wallace Tree Approach for Data Aggregation in Wireless Sensor Nodes
A Wallace Tree Approach for Data Aggregation in Wireless Sensor Nodes A Wallace Tree Approach for Data Aggregation in Wireless Sensor Nodes
A Wallace Tree Approach for Data Aggregation in Wireless Sensor Nodes ijcisjournal
 

Similar to Ca unit v 27 9-2020 (20)

Paper id 25201467
Paper id 25201467Paper id 25201467
Paper id 25201467
 
A new cryptosystem with four levels of encryption and parallel programming
A new cryptosystem with four levels of encryption and parallel programmingA new cryptosystem with four levels of encryption and parallel programming
A new cryptosystem with four levels of encryption and parallel programming
 
A NEW CRYPTOSYSTEM WITH FOUR LEVELS OF ENCRYPTION AND PARALLEL PROGRAMMING
A NEW CRYPTOSYSTEM WITH FOUR LEVELS OF ENCRYPTION AND PARALLEL PROGRAMMINGA NEW CRYPTOSYSTEM WITH FOUR LEVELS OF ENCRYPTION AND PARALLEL PROGRAMMING
A NEW CRYPTOSYSTEM WITH FOUR LEVELS OF ENCRYPTION AND PARALLEL PROGRAMMING
 
DESIGN OF DOUBLE PRECISION FLOATING POINT MULTIPLICATION ALGORITHM WITH VECTO...
DESIGN OF DOUBLE PRECISION FLOATING POINT MULTIPLICATION ALGORITHM WITH VECTO...DESIGN OF DOUBLE PRECISION FLOATING POINT MULTIPLICATION ALGORITHM WITH VECTO...
DESIGN OF DOUBLE PRECISION FLOATING POINT MULTIPLICATION ALGORITHM WITH VECTO...
 
Pipelining and vector processing
Pipelining and vector processingPipelining and vector processing
Pipelining and vector processing
 
Implementation of an Effective Self-Timed Multiplier for Single Precision Flo...
Implementation of an Effective Self-Timed Multiplier for Single Precision Flo...Implementation of an Effective Self-Timed Multiplier for Single Precision Flo...
Implementation of an Effective Self-Timed Multiplier for Single Precision Flo...
 
Cisco Activity
Cisco ActivityCisco Activity
Cisco Activity
 
A High Speed Transposed Form FIR Filter Using Floating Point Dadda Multiplier
A High Speed Transposed Form FIR Filter Using Floating Point Dadda MultiplierA High Speed Transposed Form FIR Filter Using Floating Point Dadda Multiplier
A High Speed Transposed Form FIR Filter Using Floating Point Dadda Multiplier
 
Ek35775781
Ek35775781Ek35775781
Ek35775781
 
40120130406011 2-3
40120130406011 2-340120130406011 2-3
40120130406011 2-3
 
loaders-and-linkers.pdfhhhhhccftyghgfggy
loaders-and-linkers.pdfhhhhhccftyghgfggyloaders-and-linkers.pdfhhhhhccftyghgfggy
loaders-and-linkers.pdfhhhhhccftyghgfggy
 
Fpga sotcore architecture for lifting scheme revised
Fpga sotcore architecture for lifting scheme revisedFpga sotcore architecture for lifting scheme revised
Fpga sotcore architecture for lifting scheme revised
 
Unit 4 COA.pptx
Unit 4 COA.pptxUnit 4 COA.pptx
Unit 4 COA.pptx
 
Implementation of an arithmetic logic using area efficient carry lookahead adder
Implementation of an arithmetic logic using area efficient carry lookahead adderImplementation of an arithmetic logic using area efficient carry lookahead adder
Implementation of an arithmetic logic using area efficient carry lookahead adder
 
Secure Text Transfer Using Diffie-Hellman Key Exchange Based On Cloud
Secure Text Transfer Using Diffie-Hellman Key Exchange Based On CloudSecure Text Transfer Using Diffie-Hellman Key Exchange Based On Cloud
Secure Text Transfer Using Diffie-Hellman Key Exchange Based On Cloud
 
40120130405014
4012013040501440120130405014
40120130405014
 
High Speed and Area Efficient Matrix Multiplication using Radix-4 Booth Multi...
High Speed and Area Efficient Matrix Multiplication using Radix-4 Booth Multi...High Speed and Area Efficient Matrix Multiplication using Radix-4 Booth Multi...
High Speed and Area Efficient Matrix Multiplication using Radix-4 Booth Multi...
 
MapReduce: Ordering and Large-Scale Indexing on Large Clusters
MapReduce: Ordering and  Large-Scale Indexing on Large ClustersMapReduce: Ordering and  Large-Scale Indexing on Large Clusters
MapReduce: Ordering and Large-Scale Indexing on Large Clusters
 
user_defined_functions_forinterpolation
user_defined_functions_forinterpolationuser_defined_functions_forinterpolation
user_defined_functions_forinterpolation
 
A Wallace Tree Approach for Data Aggregation in Wireless Sensor Nodes
A Wallace Tree Approach for Data Aggregation in Wireless Sensor Nodes A Wallace Tree Approach for Data Aggregation in Wireless Sensor Nodes
A Wallace Tree Approach for Data Aggregation in Wireless Sensor Nodes
 

Recently uploaded

DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersSabitha Banu
 
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxMULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxAnupkumar Sharma
 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxiammrhaywood
 
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17Celine George
 
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfInclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfTechSoup
 
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...Postal Advocate Inc.
 
What is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPWhat is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPCeline George
 
Full Stack Web Development Course for Beginners
Full Stack Web Development Course  for BeginnersFull Stack Web Development Course  for Beginners
Full Stack Web Development Course for BeginnersSabitha Banu
 
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...JhezDiaz1
 
Grade 9 Q4-MELC1-Active and Passive Voice.pptx
Grade 9 Q4-MELC1-Active and Passive Voice.pptxGrade 9 Q4-MELC1-Active and Passive Voice.pptx
Grade 9 Q4-MELC1-Active and Passive Voice.pptxChelloAnnAsuncion2
 
ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4MiaBumagat1
 
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITYISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITYKayeClaireEstoconing
 
Gas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxGas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxDr.Ibrahim Hassaan
 
ENGLISH6-Q4-W3.pptxqurter our high choom
ENGLISH6-Q4-W3.pptxqurter our high choomENGLISH6-Q4-W3.pptxqurter our high choom
ENGLISH6-Q4-W3.pptxqurter our high choomnelietumpap1
 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxthorishapillay1
 
Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Celine George
 

Recently uploaded (20)

DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginners
 
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxMULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
 
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
 
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfInclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
 
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
 
YOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptx
YOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptxYOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptx
YOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptx
 
What is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPWhat is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERP
 
Full Stack Web Development Course for Beginners
Full Stack Web Development Course  for BeginnersFull Stack Web Development Course  for Beginners
Full Stack Web Development Course for Beginners
 
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdfTataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
 
FINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptx
FINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptxFINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptx
FINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptx
 
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
 
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptxLEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
 
Grade 9 Q4-MELC1-Active and Passive Voice.pptx
Grade 9 Q4-MELC1-Active and Passive Voice.pptxGrade 9 Q4-MELC1-Active and Passive Voice.pptx
Grade 9 Q4-MELC1-Active and Passive Voice.pptx
 
ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4ANG SEKTOR NG agrikultura.pptx QUARTER 4
ANG SEKTOR NG agrikultura.pptx QUARTER 4
 
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITYISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
 
Gas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxGas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptx
 
ENGLISH6-Q4-W3.pptxqurter our high choom
ENGLISH6-Q4-W3.pptxqurter our high choomENGLISH6-Q4-W3.pptxqurter our high choom
ENGLISH6-Q4-W3.pptxqurter our high choom
 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptx
 
Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17
 

Ca unit v 27 9-2020

  • 1. 13-11-2020 1Dr. K.K. THYAGHARAJAN Contact E-mail: acdean@rmd.ac.in kkthyagharajan@yahoo.com kkthyagharajan@gmail.com Dr. K.K. THYAGHARAJAN Professor & Dean (Academic) Department of Electronics and Communication Engineering RMD ENGINEERING COLLEGE PROCESSOR ORGANIZATION Click on the links given below to view videos. UNIT – I Computer Architecture PPT-PDF http://dx.doi.org/10.13140/RG.2.2.28687.20643 https://youtu.be/DcMM_dIxWEE https://youtu.be/JoSONsTuopk UNIT – II Computer Architecture PPT-PDF http://dx.doi.org/10.13140/RG.2.2.36236.95363 https://youtu.be/thC8B4B-PyY https://youtu.be/m7JtcP5QmFA https://youtu.be/NbfTKSm4ubM https://youtu.be/RhiBtztCESI UNIT – IV Computer Architecture & Microprocessors PPT-PDF http://dx.doi.org/10.13140/RG.2.2.20718.02880 https://youtu.be/LroA8T-_vqs https://youtu.be/CU1wx8EZmvc https://youtu.be/zYADaZ5sfY0 https://youtu.be/GuC7sZEw-uM
  • 2. 13-11-2020 2Dr. K.K. THYAGHARAJAN Dr. K.K. THYAGHARAJAN Contact E-mail: acdean@rmd.ac.in kkthyagharajan@yahoo.com kkthyagharajan@gmail.com PROCESSOR ORGANIZATION FLYNN’S CLASSIFICATION OF MULTIPROCESSOR ORGANIZATION This video explains SISD, SIMD, MISD and MIMD classification
  • 3. 13-11-2020 3Dr. K.K. THYAGHARAJAN Flynn’s Classification of Processor Organization 1. SISD (Single Instruction stream Single Data stream) 2. SIMD (Single Instruction stream Multiple Data stream) 3. MISD (Multiple Instruction stream Single Data stream) 4. MIMD (Multiple Instruction stream Multiple Data stream) The processor organization deals with how the parts such as control units and processing elements (ALU) of the processor are linked together in a multiprocessor to improve the performance.
  • 4. 13-11-2020 4Dr. K.K. THYAGHARAJAN Flynn’s Classification of Processor Organization 1. SISD 2. SIMD Control Unit ALU Main Memory IS DS IS  SISD has a single control unit and gets a single instruction from the main memory at a time.  It has one processing Element (ALU) and uses one data stream connected with the main memory.  Processing unit may have more functional units (add, multiply, load etc.)  Instruction stream (Is)= Data stream (Ds) = 1  SIMD is more suitable for handling arrays in the for loops and data parallelism is achieved  SIMD gets only one copy of the code from the main memory and this operation is performed on multiple independent data obtained by the main memory as shown in the figure.  Reduces the instruction bandwidth and space  Not suitable for the case … switch statements because the execution unit (Processing unit) must perform different operations based on data  The processor should complete the current instruction before it takes the next instruction i.e. execution of the instruction is synchronous PEn PE2Control Unit PE1 MM- Main Memory IS DS1 IS DS2 DSn … MM1 MM2 MMn IS IS IS … PE – Processing Element
  • 5. 13-11-2020 5Dr. K.K. THYAGHARAJAN Flynn’s Classification of Processor Organization 3. MISD PEn PE2 PE1 MM- Main Memory … IS1 IS2 ISn PE – Processing Element CU – Control Unit Is >1 and Ds = 1  Multiple control units are used to control multiple processing units  Each control unit is handling one instruction and process it through its corresponding process element  Only one data stream is passing through all processing elements at a time from a common shared memory CUn CU2 CU1 … DS DS ISn IS2 IS1
  • 6. 13-11-2020 6Dr. K.K. THYAGHARAJAN Flynn’s Classification of Processor Organization 4. MIMD PEn PE2 PE1 … IS1 IS2 ISn PE – Processing Element CU – Control Unit Is >1 and Ds >1  Multiple control units are used to handle multiple instructions at same time  Multiple processing elements are used with separate data stream drawn from main memory for each processing element  Each process works on its own instruction and own data  Task executed by different processes are asynchronous i.e. each task can start or finish at different times  This organization actually represents a real parallel computer - Example Graphics Processing Unit (GPU) CUn CU2 CU1 … DS1 ISn IS2 IS1 MM1 MM2 MMn … Mainmemory DS2 DSn
  • 7. 13-11-2020 7Dr. K.K. THYAGHARAJAN  SIMD is a vector architecture and it is used for data level parallelism.  In vector architecture data are collected from memory, put them in proper order into a set of registers. These registers are operated sequentially using pipelined execution units. The results are written back to memory.  If two vectors A and B are to be added and the result is to be stored in the vector C, then it can be written as C = A + B . A(1), A(2) etc are vector elements  Figure 1 shows a single ‘add’ pipeline to add vectors A and B and stores the result in vector C. Here one addition is performed per cycle Figure 2 SIMD (Single Instruction stream Multiple Data stream) Figure 1 Figure 3  Figure 2 uses four add pipelines or lanes. So, it completes four additions per cycle.  The number of clocks required to execute a vector addition is reduced by a factor of 4.  Each vector lane uses a portion of vector register. Figure 3 uses four lanes. Each vector lane has more than one functional units. Three functional units FP add, FP Multiply, and a load store unit are provided The elements in a single vector are interleaved across four lanes, Each lane uses a portion of vector register. The vector storage is divided across four lanes and each lane holds every fourth element of each vector register
  • 8. 13-11-2020 8Dr. K.K. THYAGHARAJAN SIMD (Single Instruction stream Multiple Data stream)  Old array processors use 64 ALUs for doing 64 additions simultaneously  But SIMD vector architecture use less number of ALUs (even one) and passes the data through lanes and pipelines. This reduces the hardware cost.  In MIPS vector architecture, 32 vector registers are provided and each register will point 64 vector elements each of 64 bit size.  Hardware gets the addresses of the vector from these vector vector-registers. This indexed accesses are called as gather scatter.  The data need not be contiguous in main memory. Indexed load instructions gather data from the main memory and put them in contiguous vector elements.  Indexed store instruction scatters vector elements across main memory.  The number of elements in a vector operation is not in the instruction or opcode, it is in a separate register  MIPS vector instructions are obtained by appending letter v to MIPS instructions for example  addv.d # adds two double precision vectors. This instruction accesses the inputs using two vector registers  addvs.d # This instruction takes one input from a scalar register and another input using a vector register. The scalar input is added with each element in the vector.  lv # load vector (double precision data)  sv # store vector (double precision data)
  • 9. 13-11-2020 9Dr. K.K. THYAGHARAJAN SIMD (Single Instruction stream Multiple Data stream) Problem: Write an MIPS program using vector instructions to solve Y = a*X + Y Where X and Y are vectors (arrays) with 64 double precision floating point numbers i.e. 64 numbers of 64- bit size each stored in the memory. Assume that the starting address of X is in $s0 and starting address of Y is in $s1 Solution: l.d $f0, a($sp) # load scalar ‘a’ into f0 register lv $v1, 0($s0) # load vector X pointed by register s0 into v1 register mulvs.d $v2, $v1, $f0 # multiply vector v1 by scalar f0 and store the result in vector v2 lv $v3, 0($s1) # load vector Y in v3 addv.d $v4, $v2, $v3 # add vector Y (in v3) to the product (in v2) and store the result in vector v4 sv $v4, 0($s1) # store the result v4 in the vector Y (pointed by register s1)
  • 10. 13-11-2020 10Dr. K.K. THYAGHARAJAN Contact E-mail: acdean@rmd.ac.in kkthyagharajan@yahoo.com kkthyagharajan@gmail.com Dr. K.K. THYAGHARAJAN Professor & Dean (Academic) Department of Electronics and Communication Engineering RMD ENGINEERING COLLEGE Multithreading
  • 11. 13-11-2020 11Dr. K.K. THYAGHARAJAN Dr. K.K. THYAGHARAJAN Contact E-mail: acdean@rmd.ac.in kkthyagharajan@yahoo.com kkthyagharajan@gmail.com Multithreading Hardware Multithreading This video explains three types of hardware multithreading
  • 12. 13-11-2020 12Dr. K.K. THYAGHARAJAN 1. Hardware Multithreading Thread: Thread is a sequence of instructions that can run independently from other programs. When a sequence of instructions are being executed, the processor may have to wait if the next instruction or data is not available. This is called stalling. Instead of waiting, the processor may switch to another thread , execute it and come back to this thread. Multithreading: Switching from one thread (stalled thread) to another thread is known as multithreading. All the threads generally share a single address space using a program counter, and stack and register states. Process: A process includes one or more threads and their address spaces. Switching from one process to another process invokes the operating system (OS). The main difference between multithreading and the process: Multithreading uses single address space and it does not invoke the OS. But, a process switches from threads at different address spaces and requires the help of the operating system to do this switching. So, you can say that multithreading is a lightweight process and it is smaller than the process. Types of multithreading: Fine-grained multithreading Coarse-grained multithreading Simultaneous multithreading
  • 13. 13-11-2020 13Dr. K.K. THYAGHARAJAN 1.1 Fine-grained multithreading  Switching between threads happens on each instruction  Switching happens on every clock cycle  Switching is done in a round-robin fashion as shown in figure 1 i.e. 1st instruction from the 1st thread is executed then 1st instruction from the 2nd thread is executed and it continues until the 1st instruction in the last thread completes. Then the 2nd instruction from the 1st thread is executed, 2nd instruction from the 2nd thread is executed and so on. Advantage: If any thread is stalled, it is skipped and the next thread continues . This is called interleaving and this approach improves the throughput. Disadvantage: Threads that are ready to execute the next instruction without stalls should wait until other threads are over. This slows down the execution of individual threads. 1. Hardware Multithreading 1.2. Coarse-grained multithreading Threads are switched only when costly stalls (i.e. last-level cache miss) occur. So, frequent thread switching is avoided and hence slow down of individual thread execution is avoided. Disadvantage: The new thread starts execution only after the pipeline is filled up and the current instructions completes its execution. This is called start-up overhead. Advantage: The processor issues instructions from the same thread when shorter stalls occur. So pipeline may be required to be emptied or frozen. However, the time taken in this case is less compared to the pipeline start-up time and hence the throughput cost is minimized. This approach reduces the penalty of high-cost stalls because the pipeline refill time is negligible compared to stall time.
  • 14. 1. Hardware Multithreading
1.3 Simultaneous multithreading (SMT)
- Instructions from multiple independent threads are executed in the same clock cycle, using register renaming. If the threads are dependent, the dependences are handled by dynamic scheduling. Resources are not switched on every clock cycle.
- SMT processors have dynamically scheduled pipelines that exploit both thread-level and instruction-level parallelism, and they provide more functional units to implement this parallelism.
- Intel calls this hyper-threading; AMD calls it SMT.
Figure 1 shows three threads as they would execute independently, stalls included; an empty row indicates an unused clock cycle (a stall). One row of each thread is issued to the pipeline per clock cycle.
Figures 2a, 2b and 2c show how the three threads of Figure 1 execute under fine-grained, coarse-grained and simultaneous multithreading respectively.
  • 15. [Figure 1: threads A, B and C executing independently, each with costly stalls. Figure 2: the same three threads executed with multithreading, shown over time: (2a) fine-grained MT, (2b) coarse-grained MT, (2c) simultaneous MT.]
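To make the two switching policies concrete, here is a toy C simulation (illustrative only: the thread readiness traces are made up, and the pipeline-refill penalty of a coarse-grained switch is not modelled). Fine-grained rotates to the next ready thread on every cycle; coarse-grained stays on one thread until it stalls.

    #include <stdio.h>

    #define T 3   /* threads */
    #define C 12  /* cycles simulated */

    /* Hypothetical readiness traces: trace[t][c] is 1 if thread t could
       issue an instruction in cycle c, 0 if it would stall. */
    static const int trace[T][C] = {
        {1,1,0,0,1,1,1,0,1,1,1,1},  /* thread A */
        {1,0,1,1,1,0,0,1,1,0,1,1},  /* thread B */
        {1,1,1,0,1,1,1,1,0,0,1,1},  /* thread C */
    };

    /* Fine-grained: switch on every cycle, skipping stalled threads. */
    static void fine_grained(void)
    {
        int t = 0;
        printf("fine-grained:   ");
        for (int c = 0; c < C; c++) {
            int k;
            for (k = 0; k < T && !trace[(t + k) % T][c]; k++)
                ;
            if (k < T) {
                t = (t + k) % T;
                printf("%c", 'A' + t);
                t = (t + 1) % T;          /* rotate every cycle */
            } else {
                printf("-");              /* all threads stalled */
            }
        }
        printf("\n");
    }

    /* Coarse-grained: keep the current thread; switch only when it stalls. */
    static void coarse_grained(void)
    {
        int t = 0;
        printf("coarse-grained: ");
        for (int c = 0; c < C; c++) {
            if (!trace[t][c]) {           /* costly stall: switch threads */
                int k;
                for (k = 1; k < T && !trace[(t + k) % T][c]; k++)
                    ;
                if (k == T) { printf("-"); continue; }
                t = (t + k) % T;
            }
            printf("%c", 'A' + t);
        }
        printf("\n");
    }

    int main(void)
    {
        fine_grained();
        coarse_grained();
        return 0;
    }

Each printed letter names the thread issuing in that cycle: the fine-grained line changes letter almost every cycle, while the coarse-grained line shows long runs of the same letter.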
  • 16. Multiprocessing
  • 17. Multiprocessing Systems: Multicore Processor & Multiprocessor
  • 18. 1. Multiprocessing Systems
1. Multiprocessor system
  1.1 Shared memory system (tightly coupled system)
    1.1.1 Uniform Memory Access (UMA)
    1.1.2 Non-Uniform Memory Access (NUMA)
  1.2 Distributed memory system (loosely coupled system) - cluster
Two types of multiprocessing:
- More than one processor on a single chip: a multicore processor. A quad-core processor can handle 4 threads, an octa-core 8 threads; more cores give better multiprocessing. An on-chip network connects the processors.
- More than one processor connected in a single system: a multiprocessor system. A Local Area Network (LAN) connects the systems.
  • 19. 1.1 Shared Memory (Tightly Coupled) System
- All the processors share a single global memory. The global memory may be divided into many modules, but a single address space is used (e.g. a multicore processor).
- Processors communicate through shared locations (variables) in the global memory.
- Access to shared data is coordinated using locks (synchronization primitives), which allow the data to be accessed by only one processor at a time, as sketched below.
- Shared memory systems use a common bus, a crossbar or a multistage network to connect processors, memory and I/O devices.
- Programs stored in the virtual address space of each processor can run independently.
- Shared memory systems are used for high-speed real-time processing and provide higher throughput than loosely coupled systems.
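A minimal sketch of the lock idea using a POSIX mutex (the names and counts are illustrative): the mutex is the synchronization primitive that lets only one thread, and hence one processor, update the shared location at a time.

    #include <pthread.h>
    #include <stdio.h>

    long shared_balance = 0;                          /* shared location in global memory */
    pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER; /* the synchronization primitive */

    void *deposit(void *arg)
    {
        (void)arg;
        for (int i = 0; i < 100000; i++) {
            pthread_mutex_lock(&lock);    /* only one holder at a time */
            shared_balance += 1;          /* critical section */
            pthread_mutex_unlock(&lock);
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t t1, t2;
        pthread_create(&t1, NULL, deposit, NULL);
        pthread_create(&t2, NULL, deposit, NULL);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        printf("balance = %ld\n", shared_balance);  /* exactly 200000, thanks to the lock */
        return 0;
    }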
  • 20. 1.1.1 Uniform Memory Access (UMA) System
UMA systems are divided into two types:
1. Symmetric UMA (SUMA)
2. Asymmetric UMA (AUMA)
In SUMA, all processors are identical. Each processor may have local cache memory and I/O devices. Physical memory is uniformly shared by all processors, with equal access time to all words.
In AUMA, one master processor executes the operating system, while the other processors may be dedicated to special tasks such as graphics rendering or mathematical functions.
[Figure: processors 1..n, each with a cache, connected through an interconnection network to the shared memory and I/O.]
  • 21. 1.1.2 Non-Uniform Memory Access (NUMA) System
In a NUMA system each processor may have a local memory holding its own private program and private data. The collection of all local memories forms the global memory, i.e. the local memory of one processor may be accessed by another processor through shared variables. The time taken for one processor to access the local memory of another, remote processor is not uniform: it depends on the locations of the processor and the memory. NUMA systems can scale to larger sizes with lower latency (access time) to local memory.
[Figure: processors P1..Pn and memories M1..Mn joined by a processor-memory interconnection network and an interrupt-signal interconnection network, with an I/O-processor interconnection network serving devices D1..Dn over I/O channels.]
  • 22. 1.2 Distributed Memory System (DMS) (Loosely Coupled System)
- DMS systems do not use a global shared memory: a global memory creates memory conflicts and slows down execution.
- A DMS has multiple processors, and each processor has a large local memory and a set of I/O devices that are not shared by any other processor. Such a system is therefore called a distributed multicomputer system.
- A group of computers connected together is called a cluster, and each computer is called a node.
- The computers communicate with each other by passing messages through an interconnection network: a 'send message' routine passes a message to another computer in the cluster, and a 'receive message' routine accepts one, as sketched after the figure below.
  • 23. [Figure: nodes (P1, LM1) .. (Pn, LMn) connected by a message-passing interconnection network. LM = local memory, P = processor.]
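The send/receive routines map naturally onto MPI, the usual message-passing library for clusters. A minimal sketch, assuming MPI is installed and the program is launched on at least two processes:

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, value = 42;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* which process/node am I? */

        if (rank == 0) {
            /* 'send message': process 0 sends one int to process 1 */
            MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            /* 'receive message': process 1 receives it */
            MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            printf("process 1 received %d from process 0\n", value);
        }
        MPI_Finalize();
        return 0;
    }

Compiled with mpicc and launched with e.g. mpirun -np 2, the two ranks typically run on different nodes of the cluster.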
  • 24. Multiprocessor Network Topologies
  • 25. Multiprocessor Network Topologies
  • 26. Multiprocessor Network Topologies
- The way the nodes (processor and memory) are connected is called the network topology.
- Multicore chips require on-chip networks to connect the cores; clusters require local area networks to connect the servers.
- Networks are drawn as graphs, and the edges of the graph represent the links of the communication network.
- Nodes (computers, or processor-memory nodes) are connected to this graph through network switches. In the following diagrams, coloured circles represent switches and black squares represent processor-memory nodes.
- Network cost includes the number of switches, the number of links per switch, the length of the links, and the width (number of bits) per link.
Bus Topology
1. Uses a shared set of wires, which allows a message to be broadcast to all nodes at the same time.
2. Bandwidth of the network = bandwidth of the bus.
Ring Topology
1. Messages travel along intermediate nodes until they arrive at their final destination. Unlike a bus, a ring is capable of many simultaneous transfers.
2. Total bandwidth of the network = bandwidth of each link x number of links. If a link is as fast as the bus, a ring is therefore P times faster than a bus in the best case (P = number of processors).
Fully Connected Network (a separate topology, not the ring)
1. Every processor has a bidirectional link to every other processor.
2. Total bandwidth = P x (P-1)/2 link bandwidths.
3. Bisection bandwidth = (P/2)^2 link bandwidths. Bisection bandwidth is calculated by dividing the machine into two halves and summing the bandwidth of the links crossing the cut. Sketches of these formulas follow below.
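A small check of the bandwidth formulas above, taking the bandwidth of one link as the unit (illustrative only):

    #include <stdio.h>

    /* Bandwidths in units of one link's bandwidth, for P processors. */
    static double ring_total(int p)     { return p; }               /* P links in the ring     */
    static double full_total(int p)     { return p * (p - 1) / 2.0; } /* P*(P-1)/2 links       */
    static double full_bisection(int p) { return (p / 2.0) * (p / 2.0); } /* (P/2)^2 cut links */

    int main(void)
    {
        int p = 8;
        printf("P = %d\n", p);
        printf("ring total bandwidth            = %.0f links\n", ring_total(p));
        printf("fully connected total bandwidth = %.0f links\n", full_total(p));
        printf("fully connected bisection bw    = %.0f links\n", full_bisection(p));
        return 0;
    }

For P = 8 this prints 8, 28 and 16 link bandwidths respectively.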
  • 27. Multiprocessor Network Topologies
Star Topology
- Instead of placing a processor at every node in the network, a switch is placed at some of these nodes. Switches are smaller than processor-memory-switch nodes.
- In a star topology, all nodes are connected to a central device called a hub using point-to-point connections.
  • 28. Multiprocessor Network Topologies
Boolean Cube (n-cube) Network: for the cube shown, n = 3, so n = 3 links per switch are used within the network, one further link per switch goes to the processor, and 2^n = 2^3 = 8 nodes are connected.
2D Grid or Mesh Network: here n = 2, so n = 2 links per switch are used within the network, and one further link per switch goes to the processor.
[Figure: a Boolean cube of 8 nodes and a 2D grid/mesh, with switches and processor nodes marked.]
  • 29. Multiprocessor Network Topologies
Multistage networks: messages travel in multiple steps through the network.
1. Fully connected or crossbar network: any node can communicate with any other node in one pass through the network. For n = 8 processors, the number of switches = n^2 = 64.
2. Omega network: uses less hardware than the crossbar network, but contention may occur between messages (next slide).
  • 30. Multiprocessor Network Topologies
Omega Network: uses less hardware than the crossbar network. There are 12 switch boxes in the network shown in the figure, and each switch box contains 4 smaller switches, so the number of switches used = 12 x 4 = 48. In general the count is 2n*log2(n); here n = number of processors = 8, so the number of switches used = 2 x 8 x log2(8) = 2 x 8 x 3 = 48, as above. A short calculation comparing this with the crossbar follows below.
Unlike the crossbar, this network cannot support all combinations of simultaneous messages, so contention may occur: for example, if P1 sends a message to P4, then P0 cannot send a message to P4 or P5 at the same time.
[Figure: omega network for 8 processors; inset shows one switch box with ports A, B, C, D.]
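A quick check of the two switch-count formulas (crossbar n^2 versus omega 2n*log2(n)), illustrative only:

    #include <stdio.h>
    #include <math.h>

    int main(void)
    {
        for (int n = 8; n <= 64; n *= 2) {
            int crossbar = n * n;                  /* n^2 switches         */
            int omega    = 2 * n * (int)log2(n);   /* 2*n*log2(n) switches */
            printf("n = %2d: crossbar = %4d, omega = %4d\n", n, crossbar, omega);
        }
        return 0;
    }

For n = 8 this prints 64 versus 48; the omega network's hardware advantage grows quickly with n (at n = 64 it is 4096 versus 768).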
  • 31. Graphics Processing Unit (GPU)
  • 32. Graphics Processing Unit (GPU): Architecture of the GPU (an MIMD processor)
  • 33. Graphics Processing Unit (GPU)
- A GPU is a processor specially designed for graphics rendering tasks.
- GPUs are used to accelerate video editing, video game rendering, 3D modelling (e.g. AutoCAD), AI workloads, and similar tasks.
- A GPU breaks a complex problem into many tasks and works on them in parallel; GPUs are highly multithreaded.
Central Processing Unit (CPU):
1. Has general-purpose instructions and is more suitable for serial processing.
2. Has just a few cores with caches and can handle only a few threads at a time.
3. Has a large main memory oriented toward low latency.
4. Is designed mainly for instruction-level parallelism.
Graphics Processing Unit (GPU):
1. Has its own special instructions for handling graphics and is more suitable for parallel processing.
2. Has hundreds of cores and can handle thousands of threads simultaneously.
3. Has its own separate large memory, oriented toward bandwidth rather than latency, providing high throughput.
4. Is designed for data-level parallelism.
  • 34. GPU Architecture - NVIDIA
- NVIDIA is an American company that developed the 'Compute Unified Device Architecture' (CUDA) for its graphics processing units; Fermi is the commercial name of one such architecture. GeForce is a brand of graphics processing units designed by NVIDIA.
- A GPU contains a collection of multithreaded SIMD processors, and hence is an MIMD processor. In the Fermi architecture the number of SIMD processors is 7, 11, 14 or 15.
- A CUDA program launches kernels in parallel. The GPU executes a kernel on a grid; a grid has many thread blocks, and a thread block consists of many threads that are executed in parallel (figure 2). A skeleton of such a kernel launch is sketched below.
- Each thread within a thread block is called a machine object: it has a thread ID, program instructions, a program counter, registers, per-thread private memory, inputs and outputs. Machine objects are created, managed, scheduled and executed by the GPU.
- The GPU has two schedulers:
1. Thread Block Scheduler: assigns blocks of threads to the multithreaded SIMD processors.
2. SIMD Thread Scheduler: inside each SIMD processor, with its own controller; it identifies the threads that are ready to run, schedules them for execution, and sends them to the dispatcher when needed.
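A minimal CUDA C sketch of the grid / thread-block / thread hierarchy just described (the array size and launch dimensions are illustrative): the kernel runs as a grid of thread blocks, and each thread computes one element from its block and thread IDs.

    #include <cstdio>

    // Kernel: executed by every thread in the grid.
    __global__ void add_one(float *data, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread ID
        if (i < n)
            data[i] += 1.0f;
    }

    int main()
    {
        const int n = 1024;
        float *d_data;
        cudaMalloc(&d_data, n * sizeof(float));
        cudaMemset(d_data, 0, n * sizeof(float));

        // Grid of 4 thread blocks, 256 threads per block: 4 * 256 = 1024 threads.
        add_one<<<4, 256>>>(d_data, n);
        cudaDeviceSynchronize();

        cudaFree(d_data);
        return 0;
    }

Here the grid has 4 thread blocks of 256 threads each; the thread block scheduler assigns the blocks to SIMD processors, matching the hierarchy in figure 2.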
  • 35. Graphics Processing Unit (GPU)
- The off-chip DRAM is shared by all thread blocks; it is called global memory or GPU memory.
- The local memory of each SIMD processor is shared by the lanes within that processor, but it is not shared between two SIMD processors.
- 7, 11, 14 or 15 SIMD processors are used in the different Fermi architectures.
[Figure 1: a multithreaded SIMD processor. Figure 2: GPU memory structure, showing grids (Grid 0, Grid 1) of thread blocks, each block containing many threads.]
  • 36. Scheduling
Parallel processing involves:
- Scheduling: a method for sharing resources (processor time, data lines) among threads and processes and for balancing the load to achieve quality of service.
- Partitioning the work into parallel pieces: the task must be divided equally among all processors to avoid idle time.
- Balancing the load evenly between the processors: processors should not stay idle for long.
- Time required for synchronization: each processor should complete its allotted work in the specified time and receive its next piece of work in time; otherwise parallel processing will not be possible.
- Overhead for communication: inter-process communication has a time overhead, because the system has to determine which processes need to communicate and which do not.
Types of scheduling:
- Long-term scheduling
- Medium-term scheduling
- Short-term scheduling
- Dispatcher
  • 37. Scheduling - Amdahl's Law
Problem: To achieve a speed-up of 90 (i.e. 90 times faster) with 100 processors, what percentage of the original computation can be sequential?
Solution: Let F be the fraction of time affected by the improvement, F = (execution time affected) / (execution time before). With an amount of improvement of 100 (the 100 processors), Amdahl's law gives
    Speed-up = 1 / ((1 - F) + F/100)
Substituting the speed-up of 90 and solving:
    90 = 1 / ((1 - F) + F/100)
    (1 - F) + F/100 = 1/90
    1 - 0.99 F = 0.0111
    F = 0.999
So the fraction that is parallelized must be 0.999: to achieve a speed-up of 90 with 100 processors, the sequential percentage can be at most 0.1%.
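A short numerical check of this result (illustrative):

    #include <stdio.h>

    int main(void)
    {
        double n = 100.0;       /* processors (amount of improvement) */
        double target = 90.0;   /* desired speed-up                   */
        /* Solve target = 1 / ((1-F) + F/n) for the parallel fraction F. */
        double f = (1.0 - 1.0 / target) * (n / (n - 1.0));
        printf("parallel fraction F = %.5f\n", f);               /* 0.99888... */
        printf("sequential part     = %.2f%%\n", (1 - f) * 100); /* about 0.11%, i.e. ~0.1% */
        return 0;
    }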