UNIT IV – PARALLELISM
Parallel processing challenges – Flynn's
classification – SISD, MIMD, SIMD, SPMD,
and Vector Architectures - Hardware
multithreading – Multi-core processors and
other Shared Memory Multiprocessors -
Introduction to Graphics Processing Units,
Clusters, Warehouse Scale Computers and
other Message-Passing Multiprocessors
Introduction:
•Processing data concurrently is known as Parallel
Processing.
•Consider a multiprocessor system with 'n' processors.
If a processor fails, the system would continue to
provide service with the remaining 'n-1' processors.
•Parallelism is a mode of operation in which a process
is split into parts, which are executed simultaneously
on different processors attached to the same
computer.
•Two ways:
•Multiple functional units – two or more ALUs
•Multiple processors – two or more processors
•Multiprocessor System
•Task-level parallelism or process-level parallelism
•Parallel processing program
•Cluster
Multicore:
• Architecture design that places multiple processors on a single
die (computer chip).
• E.g., dual-, quad-, hexa-, and octa-core processors.
Necessity:
• Reduce power consumption
• Cut cost
Goals of Parallelism:
•It increases the computational speed.
• It increases throughput by letting two or more
ALUs in the CPU work concurrently.
[Throughput - amount of processing that can be
accomplished during a given interval of time]
•It improves the performance of the computer for a
given clock speed.
Types of Parallelism:
• Instruction level parallelism
•Thread level or Task level Parallelism
•Bit-level Parallelism
•Data level parallelism
•Transaction level parallelism
Instruction Level Parallelism
•When instructions in a sequence are independent and
can be executed in parallel, the sequence exhibits
Instruction Level Parallelism (ILP).
•Two primary methods are:
1. Increasing the depth of pipeline
2. Replicating the internal components.
1. Implementing a multiple issue processor
- Static and Dynamic
2. Speculation
- The compiler or processor guesses the outcome of an
instruction so that dependent instructions can start early
3. Recovery mechanisms
- Exception Handling
4. Instruction issue policy
- in-order issue with in-order completion
- in-order issue with out-of-order completion
- out-of-order issue with out-of-order completion
5. Register renaming
6. Branch prediction
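The idea of issuing independent instructions together can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical instruction format of (dest, src1, src2) register names and a two-wide, in-order, statically scheduled machine; the function name `issue_packets` is made up for this example and does not model any real processor.

```python
def issue_packets(instrs, width=2):
    """Greedily pack instructions, in program order, into issue packets.

    Each instruction is a tuple (dest, src1, src2). An instruction joins the
    current packet only if the packet has room and it neither reads (RAW)
    nor rewrites (WAW) a register written earlier in the same packet.
    """
    packets, current, written = [], [], set()
    for dest, *srcs in instrs:
        conflict = dest in written or any(s in written for s in srcs)
        if current and (len(current) == width or conflict):
            packets.append(current)          # close the packet, start a new one
            current, written = [], set()
        current.append((dest, *srcs))
        written.add(dest)
    if current:
        packets.append(current)
    return packets


program = [
    ("r1", "r2", "r3"),   # independent of the next instruction
    ("r4", "r5", "r6"),
    ("r7", "r1", "r4"),   # needs r1 and r4, so it must wait
    ("r8", "r7", "r2"),   # needs r7, so it must wait again
]
for cycle, packet in enumerate(issue_packets(program)):
    print(cycle, packet)
```

Only the first two instructions are independent, so only they share an issue slot; the dependent ones issue one per cycle.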
Parallel Processing Challenges
•The challenge facing industry is to create
hardware and software that make it easy to
write correct parallel processing programs that
execute efficiently in performance and energy.
•Challenges:
•Writing programs
•Scheduling
•Partitioning the task
•Balancing the load between processors
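Partitioning and load balancing can be illustrated with a small sketch. It assumes the task is a sum over a list and that an even split is a fair balance; `partition` and `parallel_sum` are hypothetical names. In CPython, threads do not normally speed up CPU-bound work, so the code shows the structure of the solution rather than a real performance gain.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(data, workers):
    """Split data into `workers` contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(data), workers)
    chunks, start = [], 0
    for i in range(workers):
        size = base + (1 if i < extra else 0)   # spread the remainder evenly
        chunks.append(data[start:start + size])
        start += size
    return chunks


def parallel_sum(data, workers=4):
    """Sum each chunk on its own worker, then combine the partial sums."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(sum, partition(data, workers)))


print([len(c) for c in partition(list(range(10)), 4)])
print(parallel_sum(list(range(101))))
```

If one chunk were much larger than the others, the remaining workers would sit idle while it finished, which is the load-balancing problem named above.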
Parallel Processing Challenges
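•Amdahl’s Law:
Amdahl’s law is used to calculate the performance
gain that can be obtained by improving some portion
of a computer.
\[ \text{Speedup} = \frac{1}{(1 - F_e) + \frac{F_e}{S_e}} \]
where F_e is the fraction of execution time that benefits
from the enhancement and S_e is the speedup of that
enhanced portion.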
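A short worked example may help with Amdahl's law. The sketch below takes Fe as the fraction of execution time that is enhanced and Se as the speedup of that fraction; the function name and the sample values (80% of the work sped up 4 times) are assumptions chosen for illustration.

```python
def amdahl_speedup(fe, se):
    """Overall speedup when a fraction `fe` of execution time is sped up by `se`."""
    if not 0.0 <= fe <= 1.0:
        raise ValueError("fe must lie between 0 and 1")
    if se <= 0:
        raise ValueError("se must be positive")
    return 1.0 / ((1.0 - fe) + fe / se)


# 80% of the program is enhanced and that part runs 4 times faster:
# 1 / (0.2 + 0.8 / 4) = 1 / 0.4, roughly 2.5 overall.
print(amdahl_speedup(0.8, 4))

# Even with an enormous Se, the unenhanced 20% caps the speedup near 5.
print(amdahl_speedup(0.8, 1e9))
```

The second call shows why the serial portion dominates: the speedup can never exceed 1 / (1 - Fe).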
Flynn’s Classification
•Flynn’s classification uses two basic concepts:
1. Parallelism in instruction stream and
2. Parallelism in data stream
•There are 4 possible combinations.
SISD (Single Instruction Single Data)
•A processor that can only do one job at a time from
start to finish.
SIMD (Single Instruction Multiple
Data)
•They have multiple processing/execution units and
one control unit.
•SPMD
MISD (Multiple Instruction Single
Data)
•There are N control units and N processing units operating
on the same data stream; the result of one
processor becomes the input of the next processor.
MIMD (Multiple Instruction Multiple
Data)
•Most multiprocessor systems and multiple-computer
systems fall under this category.
•Multiple SISD (MSISD)
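MIMD can be pictured as several independent instruction streams, each working on its own data. The sketch below uses Python threads as stand-ins for processors; `mimd_run` is a hypothetical helper, and the three tasks are arbitrary examples.

```python
import threading


def mimd_run(tasks):
    """Run each (function, data) pair on its own thread; return results in task order."""
    results = [None] * len(tasks)

    def worker(index, fn, data):
        results[index] = fn(data)    # each thread writes only its own slot

    threads = [
        threading.Thread(target=worker, args=(i, fn, data))
        for i, (fn, data) in enumerate(tasks)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


# Three different instruction streams, three different data sets.
print(mimd_run([(sum, [1, 2, 3]), (max, [4, 9, 2]), (sorted, [3, 1, 2])]))
```

Contrast this with SIMD, where every processing unit would run the same operation.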
Vector Architecture
•An efficient way of implementing SIMD.
•It collects data elements from memory, puts them in
order into a large set of registers, operates on them
sequentially in the registers, and then writes the results
back to memory.
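The load, operate, store pattern of a vector machine can be sketched as follows. The vector register length `VLEN = 4` is an assumed value for illustration, and Python lists stand in for memory and vector registers; real hardware would perform each step as one vector instruction.

```python
VLEN = 4  # assumed vector register length in elements, chosen for illustration


def vector_add(a, b):
    """Add two equal-length arrays one vector register's worth at a time."""
    if len(a) != len(b):
        raise ValueError("operands must have the same length")
    out = [0] * len(a)
    for start in range(0, len(a), VLEN):
        v1 = a[start:start + VLEN]               # vector load from memory
        v2 = b[start:start + VLEN]               # vector load from memory
        v3 = [x + y for x, y in zip(v1, v2)]     # one vector add over all elements
        out[start:start + len(v3)] = v3          # vector store back to memory
    return out


print(vector_add([1, 2, 3, 4, 5, 6], [10, 20, 30, 40, 50, 60]))
```

The last pass handles the two leftover elements, which is how a vector loop copes with lengths that are not a multiple of the register size.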
Hardware Multithreading
•The instruction stream is divided into several smaller
streams called Threads.
•It offers another way to reach a high degree of
instruction level parallelism.
Some terms:
•Process
•Resource ownership
•Scheduling /execution
•Process Switch
•Thread
•Thread Switch
•Two methods:
1. Explicit Multithreading
2. Implicit Multithreading
Explicit Multithreading
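•Explicit threads are visible to the application
program and to the operating system.
Implicit Multithreading
•Implicit multithreading is indirect: the threads are extracted
from a single sequential program by the compiler or the
hardware rather than written by the programmer.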
Approaches to Explicit Multithreading
•Single-threaded scalar
•Interleaved or fine-grained multithreading
•Blocked or coarse-grained multithreading
•Simultaneous multithreading (SMT)
•Chip multiprocessing
•Interleaved multithreaded scalar
•Blocked multithreaded scalar
•Superscalar
•Interleaved multithreaded superscalar
• Blocked multithreaded superscalar
•Very long instruction word (VLIW)
•Interleaved multithreaded VLIW
•Blocked multithreaded VLIW
•Simultaneous multithreading
•Chip multiprocessor
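The difference between interleaved (fine-grained) and blocked (coarse-grained) multithreading shows up in the order in which instructions issue. The sketch below is a toy model under simple assumptions: a scalar pipeline, threads given as lists of instruction names, and the string "miss" marking an instruction that causes a long stall. The function names are hypothetical.

```python
def interleaved(threads):
    """Fine-grained: take one instruction from each ready thread in turn."""
    queues = [list(t) for t in threads]
    order = []
    while any(queues):
        for tid, q in enumerate(queues):
            if q:
                order.append((tid, q.pop(0)))
    return order


def blocked(threads):
    """Coarse-grained: stay on one thread until it issues a 'miss', then switch."""
    queues = [list(t) for t in threads]
    order, tid = [], 0
    while any(queues):
        q = queues[tid]
        while q:
            instr = q.pop(0)
            order.append((tid, instr))
            if instr == "miss":      # long stall: hand the pipeline to another thread
                break
        tid = (tid + 1) % len(queues)
    return order


threads = [["a", "miss", "b"], ["c", "d"]]
print(interleaved(threads))
print(blocked(threads))
```

In the interleaved trace the thread id alternates on every issue; in the blocked trace thread 0 keeps the pipeline until its miss.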
Multicore Processors and Other
Shared Memory Multiprocessors
•Multicore architectures are classified into 3 types:
1. Type 1 (Hyperthreading technology)
2. Type 2 (Classic Multiprocessor)
3. Type 3 (Multicore system)
Shared Memory Multiprocessor (SMP)
•An SMP offers the programmer a single
physical address space across all processors.
•Classified as:
1. Uniform memory access multiprocessor (UMA)
2. Non-Uniform memory access multiprocessor
(NUMA)
S.No. | Key | UMA | NUMA
1 | Definition | UMA stands for Uniform Memory Access. | NUMA stands for Non-Uniform Memory Access.
2 | Memory controller | UMA has a single memory controller. | NUMA has multiple memory controllers.
3 | Memory access | UMA memory access is slow. | NUMA access to local memory is faster than UMA memory access.
4 | Bandwidth | UMA has limited bandwidth. | NUMA has more bandwidth than UMA.
5 | Suitability | UMA is used in general-purpose and time-sharing applications. | NUMA is used in real-time and time-critical applications.
6 | Memory access time | UMA has equal memory access time. | NUMA has varying memory access time.
7 | Bus types | 3 types of buses supported: single, multiple and crossbar. | 2 types of buses supported: tree and hierarchical.
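The "varying memory access time" of NUMA can be made concrete with a simple weighted average. This is a rough sketch, not a measurement: the latencies (80 ns local, 140 ns remote) are made-up values, and `numa_access_time` is a hypothetical helper.

```python
def numa_access_time(local_ns, remote_ns, local_fraction):
    """Average access time when `local_fraction` of accesses hit local memory."""
    if not 0.0 <= local_fraction <= 1.0:
        raise ValueError("local_fraction must lie between 0 and 1")
    return local_fraction * local_ns + (1.0 - local_fraction) * remote_ns


# Assumed latencies for illustration only.
for fraction in (1.0, 0.5, 0.0):
    print(fraction, numa_access_time(80, 140, fraction))
```

The average depends on where the data lives, whereas in a UMA machine every access would cost the same regardless of which processor issues it.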
Graphics Processing Unit (GPU)
1. GPUs vs CPUs
•The programming interfaces to a GPU are high-level
application programming interfaces (APIs) such as
DirectX, OpenGL, and NVIDIA’s C for Graphics (Cg).
•A CPU is oriented toward sequential code, while a GPU
is oriented toward parallel code.
2. Connection between CPU and GPU
3. GPU Architecture
o SIMD
One instruction operates on multiple data.
o Multithreading
Most graphics workloads have this property, since they need to
process many objects (pixels, vertices, polygons)
simultaneously.
o NVIDIA GPU architecture
1. Motherboard-integrated GPUs
2. Tesla-based GPUs – 900MHz, 128MB – DDR3 RAM
CUDA Programming
o Compute Unified Device Architecture
o CUDA is a parallel computing platform
and programming model developed by Nvidia for
general computing on its own GPUs (graphics
processing units).
o CUDA enables developers to speed up compute-
intensive applications by harnessing the power of
GPUs for the parallelizable part of the computation.
o Heterogeneous CPU and GPU System.
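The CUDA programming model launches one lightweight thread per data element, with each thread computing its own global index from its block and thread numbers. The sketch below imitates that model in plain Python so the indexing is easy to follow. It runs sequentially on the CPU; `saxpy_kernel` and `launch` are hypothetical stand-ins and not part of the CUDA API.

```python
def saxpy_kernel(block_idx, thread_idx, block_dim, n, a, x, y):
    """Body executed by one emulated thread: update one element of y = a*x + y."""
    i = block_idx * block_dim + thread_idx    # global thread index
    if i < n:                                 # guard threads past the end of the data
        y[i] = a * x[i] + y[i]


def launch(kernel, grid_dim, block_dim, *args):
    """Host-side stand-in for a kernel launch: visit every (block, thread) pair."""
    for b in range(grid_dim):
        for t in range(block_dim):
            kernel(b, t, block_dim, *args)


n, block_dim = 10, 4
grid_dim = (n + block_dim - 1) // block_dim   # enough blocks to cover n elements
x = list(range(n))
y = [1.0] * n
launch(saxpy_kernel, grid_dim, block_dim, n, 2.0, x, y)
print(y)
```

On a real GPU the two loops in `launch` disappear: the hardware runs the kernel body for many (block, thread) pairs at once, and the CPU part of the program only sets up the data and starts the kernel.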
Message-passing multiprocessors
o With no shared memory space, the alternative
way to build a multiprocessor is explicit
message passing.
o This is done by establishing a communication
channel between two processors.
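Explicit message passing over a channel can be sketched with a queue between a sender and a receiver. Python threads do share an address space, so this only imitates the discipline: the two sides are written to exchange data through the channel alone. The names `producer`, `consumer` and `run_exchange` are hypothetical.

```python
import queue
import threading


def producer(channel, items):
    """Send each item as a message, then a sentinel meaning 'no more messages'."""
    for item in items:
        channel.put(item)        # send
    channel.put(None)


def consumer(channel, received):
    """Receive messages until the sentinel arrives; record the square of each."""
    while True:
        msg = channel.get()      # blocking receive
        if msg is None:
            break
        received.append(msg * msg)


def run_exchange(items):
    channel, received = queue.Queue(), []
    threads = [
        threading.Thread(target=producer, args=(channel, items)),
        threading.Thread(target=consumer, args=(channel, received)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received


print(run_exchange([1, 2, 3]))
```

The receiver blocks until a message is available, so the channel also provides the synchronization between the two sides.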
Shared memory multiprocessor
o A Shared memory multiprocessor is a computer
system composed of multiple independent
processors that execute different instruction
streams.
o Processors share a common memory address space
and communicate with each other via memory.
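Communication through shared memory can be sketched with several threads updating one shared variable. Because all of them write the same location, the update is protected by a lock; `shared_counter` is a hypothetical example function.

```python
import threading


def shared_counter(workers=4, increments=1000):
    """Have several threads increment one shared counter, protected by a lock."""
    counter = {"value": 0}       # lives in memory visible to every thread
    lock = threading.Lock()

    def work():
        for _ in range(increments):
            with lock:           # only one thread updates the counter at a time
                counter["value"] += 1

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter["value"]


print(shared_counter())
```

Without the lock, two threads could read the same old value and one increment would be lost, which is why shared memory programs need explicit synchronization.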
Clusters
o Clusters are collections of desktop computers or
servers connected by local area networks to act as
a single large computer.
Warehouse-Scale Computers
o The largest clusters are called warehouse-scale
computers (WSCs).
o WSCs provide internet services:
1. Google
2. Facebook
3. YouTube
4. Amazon
Goals and requirements shared with servers:
•Cost-performance
•Energy efficiency
•Dependability
•Network I/O
•Interactive and batch processing workloads
Characteristics not shared with servers:
•Ample parallelism
•Operational costs count
•Scale
Questions
o List the four major groups of computers defined by
Michael J. Flynn.
o State Amdahl’s law.
o Define parallel processing.
o What is speculation?
o Define coarse-grained multithreading.
o Write a note on the SIMD processor.
o Define VLIW.
o Compare UMA and NUMA multiprocessors.
o What is a multicore processor?
Part B
o What is hardware multithreading? Compare and
contrast fine-grained and coarse-grained
multithreading.
o Discuss in detail about Instruction Level
Parallelism.
o Explain in detail Flynn’s classification of parallel
hardware.
Part B
o Explain
(i) Shared memory multiprocessor. (3)
(ii) Warehouse scale computers. (7)
(iii) Message passing multiprocessors. (4)
(iv) Parallel processing challenges. (3)
(v) Clusters and message passing system. (7)
o Describe GPU Architecture in detail.
