Distributed Memory Architecture
MS(CS) - I
Hafsa Habib
Syeda Haseeba Khanam
Amber Azhar
Zainab Khalid
Lahore College for Women University
Department of Computer Science
Content
● MIMD processor classification
● Distributed MIMD architecture
○ Basic difference between DM-MIMD and SM-MIMD
● Communication Techniques of DM-MIMD
● Major classification of DM-MIMD
○ NUMA
○ MPP
○ Cluster
● Pros and Cons of DM-MIMD over SM-MIMD architecture
○ Scalability
○ Issues in scalability
MIMD Architecture: Classification
Non-Shared MIMD Architecture (DM-MIMD)
Non Shared MIMD Architecture
● Also called Distributed Memory MIMD, Message Passing MIMD computers, or loosely coupled MIMD
● Each processor has its own local memory
○ A memory address on one processor does not map to any other processor's memory
○ There is no global address space
● Each processor operates independently on its own local memory
○ Changes to one processor's local memory have no effect on another processor's local memory
○ Therefore cache coherency between processors does not apply
● Inter-process communication is done by message passing
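The points above can be illustrated with a minimal sketch. It is a simulation only: the `Node` class, its `local_memory` dictionary and its `inbox` queue are hypothetical names introduced here, and a Python thread stands in for a second processor. Each node touches only its own memory, and data reaches another node only as an explicit message.

```python
import queue
import threading


class Node:
    """One simulated processor with private local memory (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.local_memory = {}      # private: no other node reads or writes this
        self.inbox = queue.Queue()  # stands in for the node's network interface

    def send(self, dest, key, value):
        # Explicit communication: the value travels to the other node as a message.
        dest.inbox.put((self.name, key, value))

    def receive(self):
        # Blocks until a message arrives, then stores it in this node's own memory.
        sender, key, value = self.inbox.get()
        self.local_memory[key] = value
        return sender


a, b = Node("A"), Node("B")
a.local_memory["x"] = 10   # the name "x" on node A ...
b.local_memory["x"] = 99   # ... and on node B refer to different storage

worker = threading.Thread(target=b.receive)
worker.start()
a.send(b, "x_from_A", a.local_memory["x"])
worker.join()

print(a.local_memory)  # {'x': 10}
print(b.local_memory)  # {'x': 99, 'x_from_A': 10}
```

Writing to `b.local_memory["x"]` leaves node A's `x` untouched, which is why no coherency protocol is needed between the two memories in this model.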
DM-MIMD vs SM-MIMD
DM-MIMD
● Private physical address space for each processor
● Data must be explicitly assigned to a private address space
● Communication and synchronization over the network by message passing
● Cache coherency between processors does not apply because there is no global address space
SM-MIMD
● Global address space shared by all processors
● Data is implicitly assigned to the shared address space
● Processors cooperate by reading and writing the same shared variables
● Communication through a shared bus
● Cache coherency applies because of the shared global address space
Communication Technique
Communication in DM Architecture
● A communication network is required to connect the processors and their local memories
● Communication and synchronization are done through the message passing model
● Processors share data by explicitly sending and receiving information
● Coordination is built into the message passing primitives
○ message SEND and message RECEIVE
Why Does DM Architecture Use the Message Passing Model?
In a distributed memory architecture there is no global memory, so data must be moved from one local memory to another by means of message passing.
Message Passing Model
● Communication via Send/Receive
○ Through the interconnection network
● Data is packed into larger packets
● Send transmits a message to a destination processor
● Receive indicates that a processor is ready to accept a message from a source processor
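As a rough sketch of the send/receive steps above, the example below packs several values into one packet and passes it through a queue that stands in for the interconnection network. The header layout (source id, destination id, payload length) and the helper names `pack_message` and `unpack_message` are assumptions made for illustration, not a format taken from the slides.

```python
import queue
import struct
import threading

# Hypothetical header: source id, destination id, payload length in bytes.
HEADER = struct.Struct("!HHI")


def pack_message(src, dst, values):
    """Pack a list of floats into a single packet with a small header."""
    payload = struct.pack(f"!{len(values)}d", *values)
    return HEADER.pack(src, dst, len(payload)) + payload


def unpack_message(packet):
    """Recover (source, destination, values) from a packet."""
    src, dst, length = HEADER.unpack_from(packet)
    count = length // 8  # each double occupies 8 bytes
    values = struct.unpack_from(f"!{count}d", packet, HEADER.size)
    return src, dst, list(values)


network = queue.Queue()  # stands in for the interconnection network
received = []


def receiver():
    packet = network.get()  # RECEIVE: wait until a message is available
    received.append(unpack_message(packet))


t = threading.Thread(target=receiver)
t.start()
network.put(pack_message(0, 1, [1.5, 2.5, 4.0]))  # SEND from processor 0 to 1
t.join()

print(received)  # [(0, 1, [1.5, 2.5, 4.0])]
```

Sending the three values as one packet rather than three separate messages pays the per-message overhead once, which is the usual motivation for packing data into larger packets.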
Message Passing Model (cont’d)
● When one process interacts with another, two requirements have to be satisfied:
○ Synchronization and communication
● Synchronization in the message passing model is either asynchronous or synchronous
○ If asynchronous, no acknowledgement is required at either end (sender or receiver)
■ Sender and receiver do not wait for each other and can carry on their own computations while the message is being transferred
○ If synchronous, an acknowledgement is required
■ Both processors have to wait for each other while the message is transferred (one blocks until the other is ready)
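The difference between the two modes can be sketched as follows. This is a simplified model, not a real network protocol: threads stand in for processors, a second queue carries the acknowledgement, and the names `async_send`, `sync_send` and `run` are hypothetical. The receiver is deliberately busy for a short `delay` so that the time the sender spends blocked becomes visible.

```python
import queue
import threading
import time


def async_send(channel, msg):
    # Asynchronous: hand the message over and return at once, no acknowledgement.
    channel.put(msg)


def sync_send(channel, ack, msg):
    # Synchronous: send, then block until the receiver acknowledges.
    channel.put(msg)
    ack.get()


def receiver(channel, ack, log, delay):
    time.sleep(delay)  # the receiver is busy with its own work first
    log.append(("received", channel.get()))
    if ack is not None:
        ack.put("ack")


def run(mode, delay=0.2):
    """Return how long the sender was blocked, plus the receiver's log."""
    channel, ack, log = queue.Queue(), queue.Queue(), []
    ack_channel = ack if mode == "sync" else None
    t = threading.Thread(target=receiver, args=(channel, ack_channel, log, delay))
    t.start()
    start = time.perf_counter()
    if mode == "sync":
        sync_send(channel, ack, "data")
    else:
        async_send(channel, "data")
    blocked = time.perf_counter() - start
    t.join()
    return blocked, log


async_blocked, async_log = run("async")
sync_blocked, sync_log = run("sync")
print(f"async sender blocked for {async_blocked:.3f} s")
print(f"sync sender blocked for {sync_blocked:.3f} s")
```

In the asynchronous run the sender returns almost immediately and could continue computing; in the synchronous run it waits roughly as long as the receiver stays busy.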
Pros and Cons of Message Passing Model
Pros
● The advantage for programmers is that communication is explicit, so there are fewer “performance surprises” than with the implicit communication in cache-coherent SMPs
● Synchronization is naturally associated with sending messages, reducing the possibility of errors introduced by incorrect synchronization
● Much easier for hardware designers to build
Cons
● Sending and receiving messages is much slower than accessing local memory
● It is harder to port a sequential program to a message passing multiprocessor
Communication in DM Architecture vs Communication in SM Architecture
1. Explicit or implicit communication
○ Distributed memory: explicit, via messages
○ Shared memory: implicit, via memory operations
2. Who is responsible for carrying out communication?
○ Distributed memory: the programmer is responsible for sending and receiving data
○ Shared memory: sending and receiving is automatic; the system is responsible for placing data in cache, and the programmer just loads from and stores to memory
3. Synchronization
○ Distributed memory: automatic, as part of sending and receiving messages
○ Shared memory: can be achieved using different mechanisms
4. Protocols
○ Distributed memory: fully under programmer control
○ Shared memory: hidden within the system
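The contrast in this comparison can be sketched with one small computation written in both styles. This is only an illustration: Python threads stand in for processors in both halves, and the names `shared_worker`, `message_worker` and `run_workers` are hypothetical. In the shared-memory half the workers update one shared variable and a lock has to be added as a separate synchronization mechanism; in the message-passing half each worker sends its result explicitly and the receive itself provides the coordination.

```python
import queue
import threading

data = list(range(1, 101))
chunks = [data[i::4] for i in range(4)]  # four workers, one chunk each

# Shared-memory style: implicit communication through a shared variable.
shared = {"total": 0}
lock = threading.Lock()  # synchronization is a separate mechanism


def shared_worker(chunk):
    partial = sum(chunk)
    with lock:  # without the lock, concurrent updates could be lost
        shared["total"] += partial


# Message-passing style: explicit communication, results are sent as messages.
results = queue.Queue()


def message_worker(chunk):
    results.put(sum(chunk))  # SEND the partial result


def run_workers(target):
    threads = [threading.Thread(target=target, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


run_workers(shared_worker)
run_workers(message_worker)
message_total = sum(results.get() for _ in chunks)  # RECEIVE one message per worker

print(shared["total"], message_total)  # 5050 5050
```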
Classification of Distributed Memory Architecture
Types of Distributed Memory Architecture
● DM-MIMD architecture
○ NUMA
○ Clusters
○ MPP
NUMA (Non-Uniform memory Access)
● NUMA is a computer memory design used in multiprocessing, where the memory access time depends on the memory location relative to the processor.
● Under NUMA, a processor can access its own local memory faster than non-local memory (memory local to another processor or memory shared between processors).
● The benefits of NUMA are limited to particular workloads, notably on servers where the data is often strongly associated with certain tasks or users.
● There are two morals to this performance story.
● The first is that even a single, already commonplace 32-bit processor is starting to push the limits of standard memory performance.
● The second is that even differences between conventional memory types play a role in overall system performance. It should come as no surprise that NUMA support is now in server operating systems, e.g. Microsoft’s Windows Server 2003 and the Linux 2.6 kernel.
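The cost of local versus non-local access described above can be estimated with a simple weighted average. The sketch below is a back-of-the-envelope model, not a measurement: the function name `average_access_time` and the latencies of 100 ns (local) and 300 ns (remote) are assumed values chosen only for illustration.

```python
def average_access_time(local_fraction, local_ns, remote_ns):
    """Expected access time when `local_fraction` of accesses hit local memory."""
    if not 0.0 <= local_fraction <= 1.0:
        raise ValueError("local_fraction must be between 0 and 1")
    return local_fraction * local_ns + (1.0 - local_fraction) * remote_ns


# Assumed latencies, for illustration only.
for fraction in (1.0, 0.75, 0.5):
    t = average_access_time(fraction, local_ns=100.0, remote_ns=300.0)
    print(f"{fraction:.0%} local accesses: {t:.0f} ns on average")
```

Under these assumed numbers, dropping from all-local to half-local accesses doubles the average access time, which is why NUMA pays off mainly when data stays close to the task that uses it.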
MPP (Massively Parallel Processor)
What is a Cluster
● A network of independent computers
○ Each has its own private memory and OS
○ Connected through the I/O system, e.g. Ethernet/switch, Internet
● The independent computers in a cluster are called nodes
○ Master node and computing nodes
● Cluster middleware is required
○ e.g. Message Passing Interface (MPI)
● Node management has to be considered
● Appears as a single system to the user
Clusters
Clusters split a problem into smaller tasks that are executed concurrently
Why?
● Absolute physical limits of hardware components
● Economic reasons: more complex means more expensive
● Performance limits: doubling the clock frequency does not double performance
● Large applications: demand too much memory and time
Advantages:
● Increased speed and better resource utilization, largely independent of the hardware
Disadvantages:
● Complex programming models, which make development difficult
Applications:
● Suitable for applications with independent tasks
● Supercomputers, web servers, databases, simulations
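The idea of splitting a problem into smaller, independent tasks can be sketched as below. This is a single-machine stand-in for a cluster: a thread pool plays the role of the computing nodes, and the names `split`, `compute_node` and `master` are hypothetical. On a real cluster the chunks would be shipped to other machines by middleware such as MPI.

```python
from concurrent.futures import ThreadPoolExecutor


def split(items, parts):
    """Split items into `parts` nearly equal contiguous chunks."""
    size, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def compute_node(chunk):
    # An independent task: it needs no data from any other node.
    return sum(x * x for x in chunk)


def master(items, nodes=4):
    # The master splits the problem, hands out the tasks, and combines the results.
    chunks = split(items, nodes)
    with ThreadPoolExecutor(max_workers=nodes) as pool:
        partials = list(pool.map(compute_node, chunks))
    return sum(partials)


print(master(list(range(10))))  # 285
```

Because the tasks share no data, the only communication is handing out the chunks and collecting the partial sums, which is the kind of workload the slide describes as suitable for clusters.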
Clusters vs MPP
Similar to MPPs
● Commodity processors and memory
○ Processor performance must be maximized
● Memory hierarchy includes remote memory
○ Non-uniform memory access
● No shared memory; communication is by message passing
Clusters vs MPPs
Clusters
● In a cluster, each machine is largely independent of the others in terms of memory, disk, etc.
● The machines are interconnected using some variation on normal networking.
● The cluster exists mostly in the mind of the programmer and in how he or she chooses to distribute the work.
● Best used for servers with multiple independent tasks.
MPPs
● In a Massively Parallel Processor, there really is only one machine, with thousands of CPUs tightly interconnected through the I/O subsystem.
● MPPs have exotic memory architectures to allow extremely high-speed exchange of intermediate results with neighboring processors.
● MPPs are worthwhile only for algorithms with enough parallelism to keep thousands of processors busy.
Pros & Cons of DM-MIMD over SM-MIMD
Pros of DM-MIMD over SM-MIMD
DM-MIMD
● Memory scales with the number of processors: increase the number of processors and the total memory size increases proportionately.
● Each processor can rapidly access its own memory without interference and without the overhead of maintaining global cache coherency.
● Cost effectiveness: can use commodity, off-the-shelf processors and networking.
SM-MIMD
● Lack of scalability between memory and CPUs: adding more CPUs can geometrically increase traffic on the shared memory-CPU path, and geometrically increase the traffic associated with cache memory management.
● Expense: it becomes increasingly difficult and expensive to design and produce shared memory machines with ever-increasing numbers of processors.
Cons of DM-MIMD over SM-MIMD
DM-MIMD
● Non-uniform memory access times: data residing on a remote node takes longer to access than local data.
● The programmer is responsible for many of the details associated with data communication between processors.
SM-MIMD
● Data sharing between tasks is both fast and uniform due to the proximity of memory to CPUs.
● Global address space provides a user-friendly programming perspective to memory.
Issues of DM Architecture
Latency and bandwidth for accessing distributed memory are the main memory performance issues:
● Efficiency in parallel processing is usually related to the ratio of time spent on calculation to time spent on communication: the higher the ratio, the higher the performance.
● The problem is even more severe when access to distributed memory is needed, since there is an extra level in the memory hierarchy, with latency and bandwidth that can be much worse than for local memory access.
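The relationship between calculation time, communication time, latency and bandwidth can be made concrete with a common first-order model. The sketch below is an assumption-laden estimate, not a result from the slides: it treats the cost of a message as latency plus size divided by bandwidth, assumes communication and computation do not overlap, and uses made-up numbers (100 microseconds latency, 1 GB/s bandwidth). The function names are hypothetical.

```python
def message_time(size_bytes, latency_s, bandwidth_bytes_per_s):
    """First-order cost of one message: fixed latency plus transfer time."""
    return latency_s + size_bytes / bandwidth_bytes_per_s


def efficiency(compute_s, comm_s):
    """Fraction of total time spent computing, assuming no overlap."""
    return compute_s / (compute_s + comm_s)


# Assumed network parameters, for illustration only.
comm = message_time(size_bytes=1_000_000, latency_s=1e-4, bandwidth_bytes_per_s=1e9)

for compute in (0.001, 0.01, 0.1):
    ratio = compute / comm
    print(f"calc/comm ratio {ratio:6.1f} -> efficiency {efficiency(compute, comm):.2f}")
```

As the ratio of calculation time to communication time grows, the efficiency approaches 1, which matches the first bullet above.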
Scalability and its issues
A scalable architecture is one that can scale up to meet increased workloads. In other words, if the workload suddenly exceeds the capacity of your existing software and hardware combination, you can scale up the system (software and hardware) to meet the increased workload.
Scalability to more processors is the key issue
● Access times to “distant” processors should not be very much slower than access to “nearby” processors, since non-local and collective (all-to-all) communication is important for many programs. This can be a problem for large parallel computers (hundreds or thousands of processors). Many different approaches to network topology and switching have been tried in attempting to alleviate this problem.
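One way to see why topology matters for "distant" processors is to compare the worst-case number of hops between two nodes (the network diameter) as the machine grows. The three topologies below (bidirectional ring, square 2D mesh without wraparound links, hypercube) are illustrative choices made here; the slides do not name specific topologies, and the function names are hypothetical.

```python
import math


def ring_diameter(n):
    """Worst-case hops in a bidirectional ring of n nodes."""
    return n // 2


def mesh2d_diameter(n):
    """Worst-case hops in a square 2D mesh (no wraparound) of n nodes."""
    side = math.isqrt(n)
    if side * side != n:
        raise ValueError("n must be a perfect square")
    return 2 * (side - 1)


def hypercube_diameter(n):
    """Worst-case hops in a hypercube of n nodes (n a power of two)."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    return n.bit_length() - 1


print("nodes  ring  mesh  hypercube")
for n in (16, 256, 1024):
    print(f"{n:5d} {ring_diameter(n):5d} {mesh2d_diameter(n):5d} {hypercube_diameter(n):10d}")
```

For 1024 nodes the ring needs up to 512 hops while the hypercube needs 10, at the price of more links per node, which is the kind of trade-off behind the different topology approaches mentioned above.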
We welcome your questions, suggestions and comments!

More Related Content

What's hot

Shared-Memory Multiprocessors
Shared-Memory MultiprocessorsShared-Memory Multiprocessors
Shared-Memory Multiprocessors
Salvatore La Bua
 
Centralized shared memory architectures
Centralized shared memory architecturesCentralized shared memory architectures
Centralized shared memory architectures
Gokuldhev mony
 
Distributed System ppt
Distributed System pptDistributed System ppt
Underlying principles of parallel and distributed computing
Underlying principles of parallel and distributed computingUnderlying principles of parallel and distributed computing
Underlying principles of parallel and distributed computing
GOVERNMENT COLLEGE OF ENGINEERING,TIRUNELVELI
 
Limitations of memory system performance
Limitations of memory system performanceLimitations of memory system performance
Limitations of memory system performance
Syed Zaid Irshad
 
Parallel programming using MPI
Parallel programming using MPIParallel programming using MPI
Parallel programming using MPI
Ajit Nayak
 
Lecture 1 introduction to parallel and distributed computing
Lecture 1   introduction to parallel and distributed computingLecture 1   introduction to parallel and distributed computing
Lecture 1 introduction to parallel and distributed computing
Vajira Thambawita
 
Parallelism
ParallelismParallelism
Parallelism
Md Raseduzzaman
 
Parallel computing
Parallel computingParallel computing
Parallel computing
Vinay Gupta
 
File models and file accessing models
File models and file accessing modelsFile models and file accessing models
File models and file accessing models
ishmecse13
 
distributed Computing system model
distributed Computing system modeldistributed Computing system model
distributed Computing system model
Harshad Umredkar
 
Multiprocessor
MultiprocessorMultiprocessor
Multiprocessor
Neel Patel
 
Parallel Processing Concepts
Parallel Processing Concepts Parallel Processing Concepts
Parallel Processing Concepts
Dr Shashikant Athawale
 
Cache coherence
Cache coherenceCache coherence
Cache coherence
Employee
 
6.distributed shared memory
6.distributed shared memory6.distributed shared memory
6.distributed shared memory
Gd Goenka University
 
Memory Hierarchy
Memory HierarchyMemory Hierarchy
Memory Hierarchy
chauhankapil
 
Multivector and multiprocessor
Multivector and multiprocessorMultivector and multiprocessor
Multivector and multiprocessor
Kishan Panara
 
Data Parallel and Object Oriented Model
Data Parallel and Object Oriented ModelData Parallel and Object Oriented Model
Data Parallel and Object Oriented Model
Nikhil Sharma
 
Replication in Distributed Systems
Replication in Distributed SystemsReplication in Distributed Systems
Replication in Distributed Systems
Kavya Barnadhya Hazarika
 
Operating system 31 multiple processor scheduling
Operating system 31 multiple processor schedulingOperating system 31 multiple processor scheduling
Operating system 31 multiple processor scheduling
Vaibhav Khanna
 

What's hot (20)

Shared-Memory Multiprocessors
Shared-Memory MultiprocessorsShared-Memory Multiprocessors
Shared-Memory Multiprocessors
 
Centralized shared memory architectures
Centralized shared memory architecturesCentralized shared memory architectures
Centralized shared memory architectures
 
Distributed System ppt
Distributed System pptDistributed System ppt
Distributed System ppt
 
Underlying principles of parallel and distributed computing
Underlying principles of parallel and distributed computingUnderlying principles of parallel and distributed computing
Underlying principles of parallel and distributed computing
 
Limitations of memory system performance
Limitations of memory system performanceLimitations of memory system performance
Limitations of memory system performance
 
Parallel programming using MPI
Parallel programming using MPIParallel programming using MPI
Parallel programming using MPI
 
Lecture 1 introduction to parallel and distributed computing
Lecture 1   introduction to parallel and distributed computingLecture 1   introduction to parallel and distributed computing
Lecture 1 introduction to parallel and distributed computing
 
Parallelism
ParallelismParallelism
Parallelism
 
Parallel computing
Parallel computingParallel computing
Parallel computing
 
File models and file accessing models
File models and file accessing modelsFile models and file accessing models
File models and file accessing models
 
distributed Computing system model
distributed Computing system modeldistributed Computing system model
distributed Computing system model
 
Multiprocessor
MultiprocessorMultiprocessor
Multiprocessor
 
Parallel Processing Concepts
Parallel Processing Concepts Parallel Processing Concepts
Parallel Processing Concepts
 
Cache coherence
Cache coherenceCache coherence
Cache coherence
 
6.distributed shared memory
6.distributed shared memory6.distributed shared memory
6.distributed shared memory
 
Memory Hierarchy
Memory HierarchyMemory Hierarchy
Memory Hierarchy
 
Multivector and multiprocessor
Multivector and multiprocessorMultivector and multiprocessor
Multivector and multiprocessor
 
Data Parallel and Object Oriented Model
Data Parallel and Object Oriented ModelData Parallel and Object Oriented Model
Data Parallel and Object Oriented Model
 
Replication in Distributed Systems
Replication in Distributed SystemsReplication in Distributed Systems
Replication in Distributed Systems
 
Operating system 31 multiple processor scheduling
Operating system 31 multiple processor schedulingOperating system 31 multiple processor scheduling
Operating system 31 multiple processor scheduling
 

Similar to distributed memory architecture/ Non Shared MIMD Architecture

COA-Unit4-PPT.pptx
COA-Unit4-PPT.pptxCOA-Unit4-PPT.pptx
COA-Unit4-PPT.pptx
Ruhul Amin
 
High Performance Computer Architecture
High Performance Computer ArchitectureHigh Performance Computer Architecture
High Performance Computer Architecture
Subhasis Dash
 
Computer system Architecture. This PPT is based on computer system
Computer system Architecture. This PPT is based on computer systemComputer system Architecture. This PPT is based on computer system
Computer system Architecture. This PPT is based on computer system
mohantysikun0
 
Term paper of cse(211) avdhesh sharma c1801 a24 regd 10802037
Term paper of cse(211) avdhesh sharma c1801 a24 regd 10802037Term paper of cse(211) avdhesh sharma c1801 a24 regd 10802037
Term paper of cse(211) avdhesh sharma c1801 a24 regd 10802037
Upendra Sengar
 
System on chip architectures
System on chip architecturesSystem on chip architectures
System on chip architectures
A B Shinde
 
Multicore architectures
Multicore architecturesMulticore architectures
Multicore architectures
Muhammet SOYTÜRK
 
Multiprocessor.pptx
 Multiprocessor.pptx Multiprocessor.pptx
Multiprocessor.pptx
Muhammad54342
 
PGAS Programming Model
PGAS Programming ModelPGAS Programming Model
PGAS Programming Model
ch adnan
 
Distributed system lectures
Distributed system lecturesDistributed system lectures
Distributed system lectures
marwaeng
 
CA UNIT IV.pptx
CA UNIT IV.pptxCA UNIT IV.pptx
CA UNIT IV.pptx
ssuser9dbd7e
 
intro, definitions, basic laws+.pptx
intro, definitions, basic laws+.pptxintro, definitions, basic laws+.pptx
intro, definitions, basic laws+.pptx
ssuser413a98
 
PARALLELISM IN MULTICORE PROCESSORS
PARALLELISM  IN MULTICORE PROCESSORSPARALLELISM  IN MULTICORE PROCESSORS
PARALLELISM IN MULTICORE PROCESSORS
Amirthavalli Senthil
 
Software Design Practices for Large-Scale Automation
Software Design Practices for Large-Scale AutomationSoftware Design Practices for Large-Scale Automation
Software Design Practices for Large-Scale Automation
Hao Xu
 
Advanced computer architecture
Advanced computer architectureAdvanced computer architecture
Advanced computer architecture
krishnaviswambharan
 
Cloud Computing-UNIT 1 claud computing basics
Cloud Computing-UNIT 1 claud computing basicsCloud Computing-UNIT 1 claud computing basics
Cloud Computing-UNIT 1 claud computing basics
moeincanada007
 
Dichotomy of parallel computing platforms
Dichotomy of parallel computing platformsDichotomy of parallel computing platforms
Dichotomy of parallel computing platforms
Syed Zaid Irshad
 
Ceg4131 models
Ceg4131 modelsCeg4131 models
Ceg4131 models
anandme07
 
Grid computing
Grid computingGrid computing
Grid computing
Shashwat Shriparv
 
Distributed Computing
Distributed ComputingDistributed Computing
Distributed Computing
Sudarsun Santhiappan
 
Advance computer architecture
Advance computer architecture Advance computer architecture
Advance computer architecture
SabthamiS1
 

Similar to distributed memory architecture/ Non Shared MIMD Architecture (20)

COA-Unit4-PPT.pptx
COA-Unit4-PPT.pptxCOA-Unit4-PPT.pptx
COA-Unit4-PPT.pptx
 
High Performance Computer Architecture
High Performance Computer ArchitectureHigh Performance Computer Architecture
High Performance Computer Architecture
 
Computer system Architecture. This PPT is based on computer system
Computer system Architecture. This PPT is based on computer systemComputer system Architecture. This PPT is based on computer system
Computer system Architecture. This PPT is based on computer system
 
Term paper of cse(211) avdhesh sharma c1801 a24 regd 10802037
Term paper of cse(211) avdhesh sharma c1801 a24 regd 10802037Term paper of cse(211) avdhesh sharma c1801 a24 regd 10802037
Term paper of cse(211) avdhesh sharma c1801 a24 regd 10802037
 
System on chip architectures
System on chip architecturesSystem on chip architectures
System on chip architectures
 
Multicore architectures
Multicore architecturesMulticore architectures
Multicore architectures
 
Multiprocessor.pptx
 Multiprocessor.pptx Multiprocessor.pptx
Multiprocessor.pptx
 
PGAS Programming Model
PGAS Programming ModelPGAS Programming Model
PGAS Programming Model
 
Distributed system lectures
Distributed system lecturesDistributed system lectures
Distributed system lectures
 
CA UNIT IV.pptx
CA UNIT IV.pptxCA UNIT IV.pptx
CA UNIT IV.pptx
 
intro, definitions, basic laws+.pptx
intro, definitions, basic laws+.pptxintro, definitions, basic laws+.pptx
intro, definitions, basic laws+.pptx
 
PARALLELISM IN MULTICORE PROCESSORS
PARALLELISM  IN MULTICORE PROCESSORSPARALLELISM  IN MULTICORE PROCESSORS
PARALLELISM IN MULTICORE PROCESSORS
 
Software Design Practices for Large-Scale Automation
Software Design Practices for Large-Scale AutomationSoftware Design Practices for Large-Scale Automation
Software Design Practices for Large-Scale Automation
 
Advanced computer architecture
Advanced computer architectureAdvanced computer architecture
Advanced computer architecture
 
Cloud Computing-UNIT 1 claud computing basics
Cloud Computing-UNIT 1 claud computing basicsCloud Computing-UNIT 1 claud computing basics
Cloud Computing-UNIT 1 claud computing basics
 
Dichotomy of parallel computing platforms
Dichotomy of parallel computing platformsDichotomy of parallel computing platforms
Dichotomy of parallel computing platforms
 
Ceg4131 models
Ceg4131 modelsCeg4131 models
Ceg4131 models
 
Grid computing
Grid computingGrid computing
Grid computing
 
Distributed Computing
Distributed ComputingDistributed Computing
Distributed Computing
 
Advance computer architecture
Advance computer architecture Advance computer architecture
Advance computer architecture
 

Recently uploaded

2008 BUILDING CONSTRUCTION Illustrated - Ching Chapter 02 The Building.pdf
2008 BUILDING CONSTRUCTION Illustrated - Ching Chapter 02 The Building.pdf2008 BUILDING CONSTRUCTION Illustrated - Ching Chapter 02 The Building.pdf
2008 BUILDING CONSTRUCTION Illustrated - Ching Chapter 02 The Building.pdf
Yasser Mahgoub
 
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressions
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressionsKuberTENes Birthday Bash Guadalajara - K8sGPT first impressions
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressions
Victor Morales
 
Manufacturing Process of molasses based distillery ppt.pptx
Manufacturing Process of molasses based distillery ppt.pptxManufacturing Process of molasses based distillery ppt.pptx
Manufacturing Process of molasses based distillery ppt.pptx
Madan Karki
 
Data Driven Maintenance | UReason Webinar
Data Driven Maintenance | UReason WebinarData Driven Maintenance | UReason Webinar
Data Driven Maintenance | UReason Webinar
UReason
 
Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024
Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024
Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024
Sinan KOZAK
 
artificial intelligence and data science contents.pptx
artificial intelligence and data science contents.pptxartificial intelligence and data science contents.pptx
artificial intelligence and data science contents.pptx
GauravCar
 
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECTCHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT
jpsjournal1
 
AI assisted telemedicine KIOSK for Rural India.pptx
AI assisted telemedicine KIOSK for Rural India.pptxAI assisted telemedicine KIOSK for Rural India.pptx
AI assisted telemedicine KIOSK for Rural India.pptx
architagupta876
 
Hematology Analyzer Machine - Complete Blood Count
Hematology Analyzer Machine - Complete Blood CountHematology Analyzer Machine - Complete Blood Count
Hematology Analyzer Machine - Complete Blood Count
shahdabdulbaset
 
Material for memory and display system h
Material for memory and display system hMaterial for memory and display system h
Material for memory and display system h
gowrishankartb2005
 
Properties Railway Sleepers and Test.pptx
Properties Railway Sleepers and Test.pptxProperties Railway Sleepers and Test.pptx
Properties Railway Sleepers and Test.pptx
MDSABBIROJJAMANPAYEL
 
Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...
Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...
Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...
IJECEIAES
 
CEC 352 - SATELLITE COMMUNICATION UNIT 1
CEC 352 - SATELLITE COMMUNICATION UNIT 1CEC 352 - SATELLITE COMMUNICATION UNIT 1
CEC 352 - SATELLITE COMMUNICATION UNIT 1
PKavitha10
 
Seminar on Distillation study-mafia.pptx
Seminar on Distillation study-mafia.pptxSeminar on Distillation study-mafia.pptx
Seminar on Distillation study-mafia.pptx
Madan Karki
 
Mechanical Engineering on AAI Summer Training Report-003.pdf
Mechanical Engineering on AAI Summer Training Report-003.pdfMechanical Engineering on AAI Summer Training Report-003.pdf
Mechanical Engineering on AAI Summer Training Report-003.pdf
21UME003TUSHARDEB
 
IEEE Aerospace and Electronic Systems Society as a Graduate Student Member
IEEE Aerospace and Electronic Systems Society as a Graduate Student MemberIEEE Aerospace and Electronic Systems Society as a Graduate Student Member
IEEE Aerospace and Electronic Systems Society as a Graduate Student Member
VICTOR MAESTRE RAMIREZ
 
BRAIN TUMOR DETECTION for seminar ppt.pdf
BRAIN TUMOR DETECTION for seminar ppt.pdfBRAIN TUMOR DETECTION for seminar ppt.pdf
BRAIN TUMOR DETECTION for seminar ppt.pdf
LAXMAREDDY22
 
Null Bangalore | Pentesters Approach to AWS IAM
Null Bangalore | Pentesters Approach to AWS IAMNull Bangalore | Pentesters Approach to AWS IAM
Null Bangalore | Pentesters Approach to AWS IAM
Divyanshu
 
Curve Fitting in Numerical Methods Regression
Curve Fitting in Numerical Methods RegressionCurve Fitting in Numerical Methods Regression
Curve Fitting in Numerical Methods Regression
Nada Hikmah
 
Embedded machine learning-based road conditions and driving behavior monitoring
Embedded machine learning-based road conditions and driving behavior monitoringEmbedded machine learning-based road conditions and driving behavior monitoring
Embedded machine learning-based road conditions and driving behavior monitoring
IJECEIAES
 

Recently uploaded (20)

2008 BUILDING CONSTRUCTION Illustrated - Ching Chapter 02 The Building.pdf
2008 BUILDING CONSTRUCTION Illustrated - Ching Chapter 02 The Building.pdf2008 BUILDING CONSTRUCTION Illustrated - Ching Chapter 02 The Building.pdf
2008 BUILDING CONSTRUCTION Illustrated - Ching Chapter 02 The Building.pdf
 
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressions
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressionsKuberTENes Birthday Bash Guadalajara - K8sGPT first impressions
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressions
 
Manufacturing Process of molasses based distillery ppt.pptx
Manufacturing Process of molasses based distillery ppt.pptxManufacturing Process of molasses based distillery ppt.pptx
Manufacturing Process of molasses based distillery ppt.pptx
 
Data Driven Maintenance | UReason Webinar
Data Driven Maintenance | UReason WebinarData Driven Maintenance | UReason Webinar
Data Driven Maintenance | UReason Webinar
 
Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024
Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024
Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024
 
artificial intelligence and data science contents.pptx
artificial intelligence and data science contents.pptxartificial intelligence and data science contents.pptx
artificial intelligence and data science contents.pptx
 
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECTCHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT
 
AI assisted telemedicine KIOSK for Rural India.pptx
AI assisted telemedicine KIOSK for Rural India.pptxAI assisted telemedicine KIOSK for Rural India.pptx
AI assisted telemedicine KIOSK for Rural India.pptx
 
Hematology Analyzer Machine - Complete Blood Count
Hematology Analyzer Machine - Complete Blood CountHematology Analyzer Machine - Complete Blood Count
Hematology Analyzer Machine - Complete Blood Count
 
Material for memory and display system h
Material for memory and display system hMaterial for memory and display system h
Material for memory and display system h
 
Properties Railway Sleepers and Test.pptx
Properties Railway Sleepers and Test.pptxProperties Railway Sleepers and Test.pptx
Properties Railway Sleepers and Test.pptx
 
Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...
Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...
Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...
 
CEC 352 - SATELLITE COMMUNICATION UNIT 1
CEC 352 - SATELLITE COMMUNICATION UNIT 1CEC 352 - SATELLITE COMMUNICATION UNIT 1
CEC 352 - SATELLITE COMMUNICATION UNIT 1
 
Seminar on Distillation study-mafia.pptx
Seminar on Distillation study-mafia.pptxSeminar on Distillation study-mafia.pptx
Seminar on Distillation study-mafia.pptx
 
Mechanical Engineering on AAI Summer Training Report-003.pdf
Mechanical Engineering on AAI Summer Training Report-003.pdfMechanical Engineering on AAI Summer Training Report-003.pdf
Mechanical Engineering on AAI Summer Training Report-003.pdf
 
IEEE Aerospace and Electronic Systems Society as a Graduate Student Member
IEEE Aerospace and Electronic Systems Society as a Graduate Student MemberIEEE Aerospace and Electronic Systems Society as a Graduate Student Member
IEEE Aerospace and Electronic Systems Society as a Graduate Student Member
 
BRAIN TUMOR DETECTION for seminar ppt.pdf
BRAIN TUMOR DETECTION for seminar ppt.pdfBRAIN TUMOR DETECTION for seminar ppt.pdf
BRAIN TUMOR DETECTION for seminar ppt.pdf
 
Null Bangalore | Pentesters Approach to AWS IAM
Null Bangalore | Pentesters Approach to AWS IAMNull Bangalore | Pentesters Approach to AWS IAM
Null Bangalore | Pentesters Approach to AWS IAM
 
Curve Fitting in Numerical Methods Regression
Curve Fitting in Numerical Methods RegressionCurve Fitting in Numerical Methods Regression
Curve Fitting in Numerical Methods Regression
 
Embedded machine learning-based road conditions and driving behavior monitoring
Embedded machine learning-based road conditions and driving behavior monitoringEmbedded machine learning-based road conditions and driving behavior monitoring
Embedded machine learning-based road conditions and driving behavior monitoring
 

distributed memory architecture/ Non Shared MIMD Architecture

  • 1. Distributed Memory Architecture MS(CS) - I Hafsa Habib Syeda Haseeba Khanam Amber Azhar Zainab Khalid Lahore College for Women University Department of Computer Science
  • 2. Content ● MIMD processor classification ● Distributed MIMD architecture ○ Basic difference between DM-MIMD and SM-MIMD ● Communication Techniques of DM-MIMD ● Major classification of DM-MIMD ○ NUMA ○ MPP ○ Cluster ● Pros and Cons of DM-MIMD over SM-MIMD architecture ○ Scalability ○ Issues in scalability 2
  • 5. Non Shared MIMD Architecture ● Also called Distributed Memory MIMD, Message Passing MIMD, or loosely coupled MIMD ● Each processor has its own local memory ○ A memory address on one processor does not map to another processor's memory ○ There is no global address space ● Each processor operates independently on its own local memory ○ Changes to one processor’s local memory have no effect on another processor’s local memory ○ Therefore cache coherency between processors does not apply ● Inter-process communication is done by message passing
  • 7. DM-MIMD vs SM-MIMD DM-MIMD: ● Private physical address space for each processor ● Data must be explicitly assigned to a private address space ● Communication and synchronization over the network by message passing ● Cache coherency between processors does not apply because there is no global address space SM-MIMD: ● Global address space shared by all processors ● Data is implicitly assigned to the shared address space ● Processors cooperate by reading and writing the same shared variables ● Communication through a shared bus ● Cache coherency applies because of the shared global address space
  • 10. Communication in DM Architecture ● Requires a communication network to connect the processors' local memories. ● Communication and synchronization are done through the message passing model. ● Processors share data by explicitly sending and receiving information. ● Coordination is built into the message passing primitives ○ message SEND and message RECEIVE
  • 11. Why does DM architecture use the Message Passing Model? In a distributed memory architecture there is no global memory, so data must be moved from one local memory to another by means of message passing.
  • 12. Message Passing Model ● Communication via Send/Receive ○ Through the interconnection network ● Data is packed into larger packets ● Send transmits a message to a destination processor ● Receive indicates that a processor is ready to accept a message from a source processor
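The Send/Receive primitives on this slide can be sketched in a few lines of Python. This is only a single-process simulation, not code from the deck: the `Node` class, its `inbox` queue (standing in for the interconnection network), and `run_demo` are hypothetical names chosen for illustration. A real distributed-memory machine would use separate processes or machines and a library such as MPI.

```python
import queue
import threading


class Node:
    """A simulated processor with private memory and an inbox for messages."""

    def __init__(self, name):
        self.name = name
        self.memory = {}            # private: no other node reads or writes this
        self.inbox = queue.Queue()  # assumption: stands in for the network

    def send(self, dest, key, value):
        # Explicit SEND: copy the data into a message addressed to dest.
        dest.inbox.put((self.name, key, value))

    def receive(self):
        # Explicit RECEIVE: block until a message arrives, then store the
        # payload in this node's own local memory.
        source, key, value = self.inbox.get()
        self.memory[key] = value
        return source


def run_demo():
    p0, p1 = Node("P0"), Node("P1")
    p0.memory["x"] = 42

    receiver = threading.Thread(target=p1.receive)
    receiver.start()
    p0.send(p1, "x", p0.memory["x"])
    receiver.join()

    # A later change to P0's copy has no effect on P1's copy.
    p0.memory["x"] = 0
    return p0.memory, p1.memory
```

After `run_demo()`, P1 still holds the value it received even though P0 has overwritten its own copy, which is the "no global address space" property in miniature.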
  • 13. Message Passing Model (cont’d) ● When one process interacts with another, two requirements have to be satisfied: ○ Synchronization and communication ● Synchronization in the message passing model is either asynchronous or synchronous ○ If asynchronous, no acknowledgement is required at either end (receiver or sender) ■ Sender and receiver do not wait for each other and can carry on with their own computations while the message is being transferred. ○ If synchronous, an acknowledgement is required ■ Both processors have to wait for each other while the message is transferred (one blocks until the other is ready).
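The difference between the two modes can be sketched as follows. This is an illustrative simulation under stated assumptions: `Channel`, `send_async`, `send_sync`, and the `threading.Event` used as the acknowledgement are hypothetical names, and threads stand in for separate processors.

```python
import queue
import threading


class Channel:
    """A one-way channel between two simulated processors."""

    def __init__(self):
        self._messages = queue.Queue()
        self.log = []  # records the order of events for inspection

    def send_async(self, payload):
        # Asynchronous send: enqueue and return at once, no acknowledgement.
        self._messages.put((payload, None))
        self.log.append(f"async send {payload}: returned at once")

    def send_sync(self, payload, timeout=5.0):
        # Synchronous send: block until the receiver acknowledges.
        ack = threading.Event()
        self._messages.put((payload, ack))
        if not ack.wait(timeout):
            raise TimeoutError("receiver did not acknowledge")
        self.log.append(f"sync send {payload}: acknowledged")

    def receive(self):
        payload, ack = self._messages.get()
        self.log.append(f"received {payload}")
        if ack is not None:
            ack.set()  # release the blocked sender
        return payload


def run_demo():
    channel = Channel()

    # Asynchronous: the sender finishes before anyone has received.
    channel.send_async("a")
    channel.receive()

    # Synchronous: the sender returns only after the receiver took the message.
    receiver = threading.Thread(target=channel.receive)
    receiver.start()
    channel.send_sync("b")
    receiver.join()
    return channel.log
```

In the returned log, the asynchronous send is recorded before its message is received, while the synchronous send is recorded only after the receive.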
  • 15. Pros and Cons of Message Passing Model Pros: ● Communication is explicit, so programmers face fewer “performance surprises” than with the implicit communication in cache-coherent SMPs. ● Synchronization is naturally associated with sending messages, which reduces the chance of errors from incorrect synchronization. ● Message-passing hardware is much easier to design. Cons: ● Sending and receiving messages is much slower than a shared-memory access. ● It is harder to port a sequential program to a message-passing multiprocessor.
  • 16. Communication in DM Architecture Vs Communication in SM Architecture
  • 17.
      No. | Difference | Distributed Memory Architecture | Shared Memory Architecture
      1 | Explicit or implicit communication | Explicit, via messages | Implicit, via memory operations
      2 | Who is responsible for carrying out communication? | The programmer sends and receives data | Sending and receiving is automatic: the system places data in the cache, and the programmer just loads from and stores to memory
      3 | Synchronization | Automatic, as part of sending and receiving messages | Achieved using separate mechanisms
      4 | Protocols | Fully under programmer control | Hidden within the system
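To make the shared-memory column concrete, here is a minimal sketch in which workers communicate implicitly through one shared variable and need a separate mechanism (a lock) for synchronization. The function name `shared_memory_sum` and the use of Python threads as stand-ins for processors are assumptions made for illustration only.

```python
import threading


def shared_memory_sum(values, workers=4):
    total = [0]              # one variable in a shared address space
    lock = threading.Lock()  # synchronization is a separate mechanism

    def work(chunk):
        partial = sum(chunk)
        with lock:
            # Implicit communication: an ordinary store to shared memory.
            total[0] += partial

    # Give each worker every workers-th element.
    chunks = [values[i::workers] for i in range(workers)]
    threads = [threading.Thread(target=work, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total[0]
```

No worker ever sends anything: each one simply updates `total`, and the lock is what keeps concurrent updates from interfering.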
  • 19. Classification of Distributed Memory Architecture
  • 20. Types of Distributed Memory Architecture DM-MIMD architecture: ● NUMA ● Clusters ● MPP
  • 22. NUMA (Non-Uniform Memory Access) ● NUMA is a computer memory design used in multiprocessing in which memory access time depends on the memory location relative to the processor. ● Under NUMA, a processor can access its own local memory faster than non-local memory (memory local to another processor or memory shared between processors). ● The benefits of NUMA are limited to particular workloads, notably on servers where the data is often strongly associated with certain tasks or users. ● There are two morals to this performance story: ○ Even a single, already commonplace 32-bit processor is starting to push the limits of standard memory performance. ○ Even differences between conventional memory types play a role in overall system performance. ● It should therefore come as no surprise that NUMA support is now in server operating systems, e.g. Microsoft’s Windows Server 2003 and the Linux 2.6 kernel.
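A simple cost model shows why NUMA helps only when data stays close to the processor that uses it. The function below is a sketch, not a measurement: `average_access_ns` is a hypothetical name, and the latencies used in the usage note are made-up round numbers.

```python
def average_access_ns(local_fraction, local_ns, remote_ns):
    """Expected memory access time when a given fraction of accesses is local."""
    if not 0.0 <= local_fraction <= 1.0:
        raise ValueError("local_fraction must be between 0 and 1")
    return local_fraction * local_ns + (1.0 - local_fraction) * remote_ns
```

With assumed latencies of 100 ns for local and 300 ns for remote memory, a workload whose accesses are all local averages 100 ns, while one that splits its accesses evenly averages 200 ns. This is why workloads whose data is strongly tied to particular tasks benefit most.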
  • 23-25. MPP (Massively Parallel Processor) architecture (figure-only slides; the diagrams are not included in this transcript)
  • 26. What is a Cluster ● A network of independent computers ○ Each has its own private memory and OS ○ Connected through the I/O system, e.g. Ethernet/switch, Internet ● The independent computers in a cluster are called nodes ○ Master node and computing nodes ● Cluster middleware is required ○ e.g. Message Passing Interface (MPI) ● Node management has to be considered ● Appears as a single system to the user
  • 27. Clusters Clusters split a problem into smaller tasks that are executed concurrently. Why? ● Absolute physical limits of hardware components ● Economic reasons: more complex hardware is more expensive ● Performance limits: doubling the clock frequency does not double performance ● Large applications demand too much memory and time for a single machine Advantages: ● Increased speed and better resource utilization, largely independent of the hardware Disadvantages: ● Complex programming models make development difficult Applications: ● Suitable for applications with independent tasks: supercomputers, web servers, databases, simulations
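The "split into smaller tasks" idea can be sketched as a master that partitions the data, lets each computing node work on its own chunk, and combines the partial results. The names `split`, `worker`, and `master` are hypothetical, and the workers run one after another here; on a real cluster each chunk would be sent to a separate node and processed concurrently.

```python
def split(data, nodes):
    """Split data into `nodes` contiguous chunks of nearly equal size."""
    size, extra = divmod(len(data), nodes)
    chunks, start = [], 0
    for i in range(nodes):
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def worker(chunk):
    """Independent task run by one computing node: sum of squares."""
    return sum(x * x for x in chunk)


def master(data, nodes):
    """Master node: distribute the chunks, then combine the partial results."""
    partials = [worker(chunk) for chunk in split(data, nodes)]
    return sum(partials)
```

Because each chunk is processed without reference to any other, this is the kind of independent-task workload the slide describes as a good fit for clusters.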
  • 29. Clusters vs MPP Clusters are similar to MPPs: ● Commodity processors and memory ○ Processor performance must be maximized ● The memory hierarchy includes remote memory ○ Non-uniform memory access ● No shared memory; communication is by message passing
  • 30. Clusters vs MPPs Clusters: ● In a cluster, each machine is largely independent of the others in terms of memory, disk, etc. ● The machines are interconnected using some variation on normal networking. ● The cluster exists mostly in the mind of the programmer and in how he or she chooses to distribute the work. ● Best used for servers with multiple independent tasks. MPPs: ● In a massively parallel processor, there really is only one machine, with thousands of CPUs tightly interconnected with the I/O subsystem. ● MPPs have exotic memory architectures that allow extremely high-speed exchange of intermediate results with neighboring processors. ● MPPs are of use only for algorithms that are embarrassingly parallel.
  • 31. Content ● MIMD processor classification ● Distributed MIMD architecture ○ Basic difference between DM-MIMD and SM-MIMD ● Communication Technique ● Major classification of DM-MIMD ○ NUMA ○ MPP ○ Cluster ● Pros and Cons of DM-MIMD over SM-MIMD architecture ○ Issues in DM Architecture ○ Scalability
  • 32. Pros & Cons of DM-MIMD over SM-MIMD
  • 33. Pros of DM-MIMD over SM-MIMD DM-MIMD: ● Memory scales with the number of processors: increase the number of processors and the total memory size increases proportionately. ● Each processor can rapidly access its own memory without interference and without the overhead of maintaining global cache coherency. ● Cost effectiveness: can use commodity, off-the-shelf processors and networking. SM-MIMD: ● Lack of scalability between memory and CPUs: adding more CPUs can geometrically increase traffic on the shared memory-CPU path and the traffic associated with cache/memory management. ● Expense: it becomes increasingly difficult and expensive to design and produce shared memory machines with ever-increasing numbers of processors.
  • 34. Cons of DM-MIMD over SM-MIMD DM-MIMD: ● Non-uniform memory access times: data residing on a remote node takes longer to access than local data. ● The programmer is responsible for many of the details of data communication between processors. SM-MIMD: ● Data sharing between tasks is both fast and uniform because of the proximity of memory to the CPUs. ● The global address space gives a user-friendly programming view of memory.
  • 35. Issues of DM Architecture Latency and bandwidth for accessing distributed memory are the main memory performance issues: ● Efficiency in parallel processing is usually related to the ratio of calculation time to communication time; the higher the ratio, the better the performance. ● The problem is even more severe when access to distributed memory is needed, since there is an extra level in the memory hierarchy whose latency can be much higher, and whose bandwidth much lower, than for local memory access.
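The role of latency, bandwidth, and the calculation-to-communication ratio can be put into a small model. This is a common textbook simplification offered as a sketch, not a formula from the deck: `message_time` assumes a fixed per-message latency plus size divided by bandwidth, and `efficiency` assumes that computation and communication do not overlap. Both function names are hypothetical.

```python
def message_time(size_bytes, latency_s, bandwidth_bytes_per_s):
    """Time to deliver one message: fixed latency plus transfer time."""
    return latency_s + size_bytes / bandwidth_bytes_per_s


def efficiency(compute_s, communicate_s):
    """Fraction of the total time spent on useful calculation."""
    return compute_s / (compute_s + communicate_s)
```

Under this model, a program that computes for 9 seconds and communicates for 1 second has an efficiency of 0.9, while one that spends equal time on both drops to 0.5. Small messages are dominated by latency and large ones by bandwidth.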
  • 36. Scalability and its issues A scalable architecture is one that can scale up to meet increased workloads. In other words, if the workload suddenly exceeds the capacity of your existing software and hardware combination, you can scale up the system (software and hardware) to meet it. Scalability to more processors is the key issue: ● Access times to “distant” processors should not be very much slower than access to “nearby” processors, since non-local and collective (all-to-all) communication is important for many programs. This can be a problem for large parallel computers (hundreds or thousands of processors). Many different approaches to network topology and switching have been tried in attempts to alleviate this problem.
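One way to see how topology affects access to distant processors is to compare the network diameter, the largest number of hops between any two processors, for a few standard topologies. The deck does not name specific topologies, so the ring, square 2D mesh (without wraparound links), and hypercube below are chosen here as illustrative assumptions, as are the function names.

```python
import math


def ring_diameter(p):
    """Worst-case hops between two of p processors on a bidirectional ring."""
    return p // 2


def mesh_diameter(p):
    """Worst-case hops on a square 2D mesh without wraparound links."""
    side = math.isqrt(p)
    if side * side != p:
        raise ValueError("p must be a perfect square")
    return 2 * (side - 1)


def hypercube_diameter(p):
    """Worst-case hops on a hypercube; p must be a power of two."""
    if p < 1 or p & (p - 1) != 0:
        raise ValueError("p must be a power of two")
    return p.bit_length() - 1
```

For 1024 processors the worst case is 512 hops on a ring, 62 on a 32 x 32 mesh, and 10 on a hypercube, which illustrates why the choice of interconnect matters as machines grow to hundreds or thousands of processors.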
  • 37. We welcome your questions, suggestions, and comments!

Editor's Notes

  1. http://slideplayer.com/slide/8893733/
  2. DM: Protocols are complex for the programmer, so communication is treated as an I/O call. SM: Communication can be close to the hardware because of the shared bus, and modifying the shared memory’s hardware can make communication faster.
  3. http://www.brainkart.com/article/Computer-Clusters-and-MPP-Architectures_11316/
  4. Commodity computing involves the use of large numbers of already-available computing components for parallel computing, to get the greatest amount of useful computation at low cost
  5. However, if you have such a problem, then an MPP can be shockingly fast.
  6. Latency is the amount of time a message takes to traverse a system. In a computer network, it is an expression of how much time it takes for a packet of data to get from one designated point to another. It is sometimes measured as the time required for a packet to be returned to its sender.