SlideShare a Scribd company logo
1 of 10
1
Bottomline
• Can’t escape multi-cores today: it is the baseline
architecture
• Performance stagnates unless we learn to transform
traditional applications into parallel threads
• It’s all about the data!
Data management: distribution, coherence, consistency
• It’s also about the programming model: onus on
application writer / compiler / hardware
• It’s also about managing on-chip communication
2
Symmetric Multiprocessors (SMP)
• A collection of processors, a collection of memory: both
are connected through some interconnect (usually, the
fastest possible)
• Symmetric because latency for any processor to access
any memory is constant – uniform memory access (UMA)
Proc 1 Proc 2 Proc 3 Proc 4
Mem 1 Mem 2 Mem 3 Mem 4
3
Distributed Memory Multiprocessors
• Each processor has local memory that is accessible
through a fast interconnect
• The different nodes are connected as I/O devices with
(potentially) slower interconnect
• Local memory access is a lot faster than remote memory
– non-uniform memory access (NUMA)
• Advantage: can be built with commodity processors and
many applications will perform well thanks to locality
Proc 1 Mem 1 Proc 2 Mem 2 Proc 3 Mem 3 Proc 4 Mem 4
4
Shared Memory Architectures
• Key differentiating feature: the address space is shared,
i.e., any processor can directly address any memory
location and access them with load/store instructions
• Cooperation is similar to a bulletin board – a processor
writes to a location and that location is visible to reads
by other threads
5
Shared Address Space
Shared
Private
Private
Private
Process P1
Process P2
Process P3
Shared
Shared
Shared
Pvt P1
Pvt P2
Pvt P3
Virtual address space
of each process
Physical address space
6
Message Passing
• Programming model that can apply to clusters of workstations, SMPs,
and even a uniprocessor
• Sends and receives are used for effecting the data transfer – usually,
each process ends up making a copy of data that is relevant to it
• Each process can only name local addresses, other processes, and
a tag to help distinguish between multiple messages
• A send-receive match is a synchronization event – hence, we no
longer need locks or barriers to co-ordinate
7
Models for SEND and RECEIVE
• Synchronous: SEND returns control back to the program
only when the RECEIVE has completed
• Blocking Asynchronous: SEND returns control back to the
program after the OS has copied the message into its space
-- the program can now modify the sent data structure
• Nonblocking Asynchronous: SEND and RECEIVE return
control immediately – the message will get copied at some
point, so the process must overlap some other computation
with the communication – other primitives are used to
probe if the communication has finished or not
8
Deterministic Execution
• Need synch after every anti-diagonal
• Potential load imbalance
• Shared-memory vs. message passing
• Function of the model for SEND-RECEIVE
• Function of the algorithm: diagonal, red-black ordering
9
Cache Coherence
A multiprocessor system is cache coherent if
• a value written by a processor is eventually visible to
reads by other processors – write propagation
• two writes to the same location by two processors are
seen in the same order by all processors – write
serialization
10
Cache Coherence Protocols
• Directory-based: A single location (directory) keeps track
of the sharing status of a block of memory
• Snooping: Every cache block is accompanied by the sharing
status of that block – all cache controllers monitor the
shared bus so they can update the sharing status of the
block, if necessary
 Write-invalidate: a processor gains exclusive access of
a block before writing by invalidating all other copies
 Write-update: when a processor writes, it updates other
shared copies of that block

More Related Content

What's hot

Parallel architecture
Parallel architectureParallel architecture
Parallel architectureMr SMAK
 
Parallel architecture-programming
Parallel architecture-programmingParallel architecture-programming
Parallel architecture-programmingShaveta Banda
 
Parallel Processing Presentation2
Parallel Processing Presentation2Parallel Processing Presentation2
Parallel Processing Presentation2daniyalqureshi712
 
Introduction to parallel processing
Introduction to parallel processingIntroduction to parallel processing
Introduction to parallel processingPage Maker
 
Multithreaded processors ppt
Multithreaded processors pptMultithreaded processors ppt
Multithreaded processors pptSiddhartha Anand
 
Multithreading computer architecture
 Multithreading computer architecture  Multithreading computer architecture
Multithreading computer architecture Haris456
 
Hardware multithreading
Hardware multithreadingHardware multithreading
Hardware multithreadingFraboni Ec
 
What is simultaneous multithreading
What is simultaneous multithreadingWhat is simultaneous multithreading
What is simultaneous multithreadingFraboni Ec
 
Dichotomy of parallel computing platforms
Dichotomy of parallel computing platformsDichotomy of parallel computing platforms
Dichotomy of parallel computing platformsSyed Zaid Irshad
 
Superscalar & superpipeline processor
Superscalar & superpipeline processorSuperscalar & superpipeline processor
Superscalar & superpipeline processorMuhammad Ishaq
 
Cache coherence
Cache coherenceCache coherence
Cache coherenceEmployee
 
Client-centric Consistency Models
Client-centric Consistency ModelsClient-centric Consistency Models
Client-centric Consistency ModelsEnsar Basri Kahveci
 
Parallel computing
Parallel computingParallel computing
Parallel computingvirend111
 

What's hot (20)

Memory models
Memory modelsMemory models
Memory models
 
Parallel architecture
Parallel architectureParallel architecture
Parallel architecture
 
Parallel architecture-programming
Parallel architecture-programmingParallel architecture-programming
Parallel architecture-programming
 
Parallel processing extra
Parallel processing extraParallel processing extra
Parallel processing extra
 
Parallel processing
Parallel processingParallel processing
Parallel processing
 
Parallel processing
Parallel processingParallel processing
Parallel processing
 
Parallel Processing Presentation2
Parallel Processing Presentation2Parallel Processing Presentation2
Parallel Processing Presentation2
 
Cache coherence
Cache coherenceCache coherence
Cache coherence
 
Cache coherence
Cache coherenceCache coherence
Cache coherence
 
Introduction to parallel processing
Introduction to parallel processingIntroduction to parallel processing
Introduction to parallel processing
 
Scope of parallelism
Scope of parallelismScope of parallelism
Scope of parallelism
 
Multithreaded processors ppt
Multithreaded processors pptMultithreaded processors ppt
Multithreaded processors ppt
 
Multithreading computer architecture
 Multithreading computer architecture  Multithreading computer architecture
Multithreading computer architecture
 
Hardware multithreading
Hardware multithreadingHardware multithreading
Hardware multithreading
 
What is simultaneous multithreading
What is simultaneous multithreadingWhat is simultaneous multithreading
What is simultaneous multithreading
 
Dichotomy of parallel computing platforms
Dichotomy of parallel computing platformsDichotomy of parallel computing platforms
Dichotomy of parallel computing platforms
 
Superscalar & superpipeline processor
Superscalar & superpipeline processorSuperscalar & superpipeline processor
Superscalar & superpipeline processor
 
Cache coherence
Cache coherenceCache coherence
Cache coherence
 
Client-centric Consistency Models
Client-centric Consistency ModelsClient-centric Consistency Models
Client-centric Consistency Models
 
Parallel computing
Parallel computingParallel computing
Parallel computing
 

Viewers also liked

Genatic Algorithm
Genatic AlgorithmGenatic Algorithm
Genatic AlgorithmYasir Khan
 
Vo gay philipin
Vo gay philipinVo gay philipin
Vo gay philipinNgo Kim Du
 
Frstorder 9 sldes read
Frstorder 9 sldes readFrstorder 9 sldes read
Frstorder 9 sldes readYasir Khan
 
Pringkat pringkatperkembangankanak-kanakdanteori-teoriperkembanganyangberkait...
Pringkat pringkatperkembangankanak-kanakdanteori-teoriperkembanganyangberkait...Pringkat pringkatperkembangankanak-kanakdanteori-teoriperkembanganyangberkait...
Pringkat pringkatperkembangankanak-kanakdanteori-teoriperkembanganyangberkait...Rozaidi Yusof
 
Dir based imp_5
Dir based imp_5Dir based imp_5
Dir based imp_5Yasir Khan
 
computer graphics at openGL (2)
computer graphics at openGL (2)computer graphics at openGL (2)
computer graphics at openGL (2)Yasir Khan
 
Cs ps, sat, fol resolution strategies
Cs ps, sat, fol resolution strategiesCs ps, sat, fol resolution strategies
Cs ps, sat, fol resolution strategiesYasir Khan
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language ProcessingYasir Khan
 
Information System Architecture and Audit Control Lecture 2
Information System Architecture and Audit Control Lecture 2Information System Architecture and Audit Control Lecture 2
Information System Architecture and Audit Control Lecture 2Yasir Khan
 
36760501 teori-dan-model-an-awal-kanak-kank
36760501 teori-dan-model-an-awal-kanak-kank36760501 teori-dan-model-an-awal-kanak-kank
36760501 teori-dan-model-an-awal-kanak-kankRozaidi Yusof
 
Knowledge Representation in Artificial intelligence
Knowledge Representation in Artificial intelligence Knowledge Representation in Artificial intelligence
Knowledge Representation in Artificial intelligence Yasir Khan
 

Viewers also liked (18)

Hpc 2
Hpc 2Hpc 2
Hpc 2
 
Genatic Algorithm
Genatic AlgorithmGenatic Algorithm
Genatic Algorithm
 
M6 game
M6 gameM6 game
M6 game
 
Hpc 6 7
Hpc 6 7Hpc 6 7
Hpc 6 7
 
M2 agents
M2 agentsM2 agents
M2 agents
 
Vo gay philipin
Vo gay philipinVo gay philipin
Vo gay philipin
 
Uncertainity
Uncertainity Uncertainity
Uncertainity
 
Hpc 4 5
Hpc 4 5Hpc 4 5
Hpc 4 5
 
Frstorder 9 sldes read
Frstorder 9 sldes readFrstorder 9 sldes read
Frstorder 9 sldes read
 
Pringkat pringkatperkembangankanak-kanakdanteori-teoriperkembanganyangberkait...
Pringkat pringkatperkembangankanak-kanakdanteori-teoriperkembanganyangberkait...Pringkat pringkatperkembangankanak-kanakdanteori-teoriperkembanganyangberkait...
Pringkat pringkatperkembangankanak-kanakdanteori-teoriperkembanganyangberkait...
 
Dir based imp_5
Dir based imp_5Dir based imp_5
Dir based imp_5
 
computer graphics at openGL (2)
computer graphics at openGL (2)computer graphics at openGL (2)
computer graphics at openGL (2)
 
C language
C languageC language
C language
 
Cs ps, sat, fol resolution strategies
Cs ps, sat, fol resolution strategiesCs ps, sat, fol resolution strategies
Cs ps, sat, fol resolution strategies
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
 
Information System Architecture and Audit Control Lecture 2
Information System Architecture and Audit Control Lecture 2Information System Architecture and Audit Control Lecture 2
Information System Architecture and Audit Control Lecture 2
 
36760501 teori-dan-model-an-awal-kanak-kank
36760501 teori-dan-model-an-awal-kanak-kank36760501 teori-dan-model-an-awal-kanak-kank
36760501 teori-dan-model-an-awal-kanak-kank
 
Knowledge Representation in Artificial intelligence
Knowledge Representation in Artificial intelligence Knowledge Representation in Artificial intelligence
Knowledge Representation in Artificial intelligence
 

Similar to Introduction 1

Similar to Introduction 1 (20)

Multiprocessor.pptx
 Multiprocessor.pptx Multiprocessor.pptx
Multiprocessor.pptx
 
22CS201 COA
22CS201 COA22CS201 COA
22CS201 COA
 
unit 4.pptx
unit 4.pptxunit 4.pptx
unit 4.pptx
 
unit 4.pptx
unit 4.pptxunit 4.pptx
unit 4.pptx
 
Lecture5
Lecture5Lecture5
Lecture5
 
Parallel & Distributed processing
Parallel & Distributed processingParallel & Distributed processing
Parallel & Distributed processing
 
Multiprocessor_YChen.ppt
Multiprocessor_YChen.pptMultiprocessor_YChen.ppt
Multiprocessor_YChen.ppt
 
Module2 MultiThreads.ppt
Module2 MultiThreads.pptModule2 MultiThreads.ppt
Module2 MultiThreads.ppt
 
High performance computing
High performance computingHigh performance computing
High performance computing
 
Snooping 2
Snooping 2Snooping 2
Snooping 2
 
parallel-processing.ppt
parallel-processing.pptparallel-processing.ppt
parallel-processing.ppt
 
Lecture 2
Lecture 2Lecture 2
Lecture 2
 
18 parallel processing
18 parallel processing18 parallel processing
18 parallel processing
 
message passing vs shared memory
message passing vs shared memorymessage passing vs shared memory
message passing vs shared memory
 
Lecture-7 Main Memroy.pptx
Lecture-7 Main Memroy.pptxLecture-7 Main Memroy.pptx
Lecture-7 Main Memroy.pptx
 
CA UNIT IV.pptx
CA UNIT IV.pptxCA UNIT IV.pptx
CA UNIT IV.pptx
 
Multicore and shared multi processor
Multicore and shared multi processorMulticore and shared multi processor
Multicore and shared multi processor
 
Multicore and shared multi processor
Multicore and shared multi processorMulticore and shared multi processor
Multicore and shared multi processor
 
6.distributed shared memory
6.distributed shared memory6.distributed shared memory
6.distributed shared memory
 
CSA unit5.pptx
CSA unit5.pptxCSA unit5.pptx
CSA unit5.pptx
 

More from Yasir Khan (19)

Lecture 6
Lecture 6Lecture 6
Lecture 6
 
Lecture 4
Lecture 4Lecture 4
Lecture 4
 
Lecture 3
Lecture 3Lecture 3
Lecture 3
 
Lecture 2
Lecture 2Lecture 2
Lecture 2
 
Lec#1
Lec#1Lec#1
Lec#1
 
Ch10 (1)
Ch10 (1)Ch10 (1)
Ch10 (1)
 
Ch09
Ch09Ch09
Ch09
 
Ch05
Ch05Ch05
Ch05
 
Snooping protocols 3
Snooping protocols 3Snooping protocols 3
Snooping protocols 3
 
Hpc sys
Hpc sysHpc sys
Hpc sys
 
Hpc 3
Hpc 3Hpc 3
Hpc 3
 
Hpc 1
Hpc 1Hpc 1
Hpc 1
 
Flynns classification
Flynns classificationFlynns classification
Flynns classification
 
Logic
LogicLogic
Logic
 
M4 heuristics
M4 heuristicsM4 heuristics
M4 heuristics
 
M3 search
M3 searchM3 search
M3 search
 
M1 intro
M1 introM1 intro
M1 intro
 
Expert system 21 sldes
Expert system 21 sldesExpert system 21 sldes
Expert system 21 sldes
 
AI Robotics
AI RoboticsAI Robotics
AI Robotics
 

Recently uploaded

Employee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxEmployee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxNirmalaLoungPoorunde1
 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxiammrhaywood
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptxVS Mahajan Coaching Centre
 
Capitol Tech U Doctoral Presentation - April 2024.pptx
Capitol Tech U Doctoral Presentation - April 2024.pptxCapitol Tech U Doctoral Presentation - April 2024.pptx
Capitol Tech U Doctoral Presentation - April 2024.pptxCapitolTechU
 
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdfEnzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdfSumit Tiwari
 
Crayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon ACrayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon AUnboundStockton
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxpboyjonauth
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdfssuser54595a
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)eniolaolutunde
 
EPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptxEPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptxRaymartEstabillo3
 
Types of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxTypes of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxEyham Joco
 
Final demo Grade 9 for demo Plan dessert.pptx
Final demo Grade 9 for demo Plan dessert.pptxFinal demo Grade 9 for demo Plan dessert.pptx
Final demo Grade 9 for demo Plan dessert.pptxAvyJaneVismanos
 
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17Celine George
 
Biting mechanism of poisonous snakes.pdf
Biting mechanism of poisonous snakes.pdfBiting mechanism of poisonous snakes.pdf
Biting mechanism of poisonous snakes.pdfadityarao40181
 
DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersSabitha Banu
 
Full Stack Web Development Course for Beginners
Full Stack Web Development Course  for BeginnersFull Stack Web Development Course  for Beginners
Full Stack Web Development Course for BeginnersSabitha Banu
 
MARGINALIZATION (Different learners in Marginalized Group
MARGINALIZATION (Different learners in Marginalized GroupMARGINALIZATION (Different learners in Marginalized Group
MARGINALIZATION (Different learners in Marginalized GroupJonathanParaisoCruz
 

Recently uploaded (20)

Employee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxEmployee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptx
 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
 
Capitol Tech U Doctoral Presentation - April 2024.pptx
Capitol Tech U Doctoral Presentation - April 2024.pptxCapitol Tech U Doctoral Presentation - April 2024.pptx
Capitol Tech U Doctoral Presentation - April 2024.pptx
 
OS-operating systems- ch04 (Threads) ...
OS-operating systems- ch04 (Threads) ...OS-operating systems- ch04 (Threads) ...
OS-operating systems- ch04 (Threads) ...
 
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdfEnzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
 
ESSENTIAL of (CS/IT/IS) class 06 (database)
ESSENTIAL of (CS/IT/IS) class 06 (database)ESSENTIAL of (CS/IT/IS) class 06 (database)
ESSENTIAL of (CS/IT/IS) class 06 (database)
 
Crayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon ACrayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon A
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptx
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)
 
EPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptxEPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptx
 
Types of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxTypes of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptx
 
Final demo Grade 9 for demo Plan dessert.pptx
Final demo Grade 9 for demo Plan dessert.pptxFinal demo Grade 9 for demo Plan dessert.pptx
Final demo Grade 9 for demo Plan dessert.pptx
 
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
 
Biting mechanism of poisonous snakes.pdf
Biting mechanism of poisonous snakes.pdfBiting mechanism of poisonous snakes.pdf
Biting mechanism of poisonous snakes.pdf
 
DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginners
 
Full Stack Web Development Course for Beginners
Full Stack Web Development Course  for BeginnersFull Stack Web Development Course  for Beginners
Full Stack Web Development Course for Beginners
 
MARGINALIZATION (Different learners in Marginalized Group
MARGINALIZATION (Different learners in Marginalized GroupMARGINALIZATION (Different learners in Marginalized Group
MARGINALIZATION (Different learners in Marginalized Group
 
Model Call Girl in Bikash Puri Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Bikash Puri  Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Bikash Puri  Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Bikash Puri Delhi reach out to us at 🔝9953056974🔝
 

Introduction 1

  • 1. 1 Bottomline • Can’t escape multi-cores today: it is the baseline architecture • Performance stagnates unless we learn to transform traditional applications into parallel threads • It’s all about the data! Data management: distribution, coherence, consistency • It’s also about the programming model: onus on application writer / compiler / hardware • It’s also about managing on-chip communication
  • 2. 2 Symmetric Multiprocessors (SMP) • A collection of processors, a collection of memory: both are connected through some interconnect (usually, the fastest possible) • Symmetric because latency for any processor to access any memory is constant – uniform memory access (UMA) Proc 1 Proc 2 Proc 3 Proc 4 Mem 1 Mem 2 Mem 3 Mem 4
  • 3. 3 Distributed Memory Multiprocessors • Each processor has local memory that is accessible through a fast interconnect • The different nodes are connected as I/O devices with (potentially) slower interconnect • Local memory access is a lot faster than remote memory – non-uniform memory access (NUMA) • Advantage: can be built with commodity processors and many applications will perform well thanks to locality Proc 1 Mem 1 Proc 2 Mem 2 Proc 3 Mem 3 Proc 4 Mem 4
  • 4. 4 Shared Memory Architectures • Key differentiating feature: the address space is shared, i.e., any processor can directly address any memory location and access them with load/store instructions • Cooperation is similar to a bulletin board – a processor writes to a location and that location is visible to reads by other threads
  • 5. 5 Shared Address Space Shared Private Private Private Process P1 Process P2 Process P3 Shared Shared Shared Pvt P1 Pvt P2 Pvt P3 Virtual address space of each process Physical address space
  • 6. 6 Message Passing • Programming model that can apply to clusters of workstations, SMPs, and even a uniprocessor • Sends and receives are used for effecting the data transfer – usually, each process ends up making a copy of data that is relevant to it • Each process can only name local addresses, other processes, and a tag to help distinguish between multiple messages • A send-receive match is a synchronization event – hence, we no longer need locks or barriers to co-ordinate
  • 7. 7 Models for SEND and RECEIVE • Synchronous: SEND returns control back to the program only when the RECEIVE has completed • Blocking Asynchronous: SEND returns control back to the program after the OS has copied the message into its space -- the program can now modify the sent data structure • Nonblocking Asynchronous: SEND and RECEIVE return control immediately – the message will get copied at some point, so the process must overlap some other computation with the communication – other primitives are used to probe if the communication has finished or not
  • 8. 8 Deterministic Execution • Need synch after every anti-diagonal • Potential load imbalance • Shared-memory vs. message passing • Function of the model for SEND-RECEIVE • Function of the algorithm: diagonal, red-black ordering
  • 9. 9 Cache Coherence A multiprocessor system is cache coherent if • a value written by a processor is eventually visible to reads by other processors – write propagation • two writes to the same location by two processors are seen in the same order by all processors – write serialization
  • 10. 10 Cache Coherence Protocols • Directory-based: A single location (directory) keeps track of the sharing status of a block of memory • Snooping: Every cache block is accompanied by the sharing status of that block – all cache controllers monitor the shared bus so they can update the sharing status of the block, if necessary  Write-invalidate: a processor gains exclusive access of a block before writing by invalidating all other copies  Write-update: when a processor writes, it updates other shared copies of that block