ADVANCED COMPUTER ARCHITECTURE
AND PARALLEL PROCESSING
Shared memory architecture
1- Shared memory systems form a major category of multiprocessors.
2- In this category all processors share a global memory.
3- Communication between tasks running on different processors is performed
through writing to and reading from the global memory.
4- All interprocessor coordination and synchronization is also accomplished via
the global memory.
5- A shared memory computer system consists of:
- a set of independent processors,
- a set of memory modules, and
- an interconnection network, as shown in Figure 4.1.
6- Two main problems need to be addressed when designing a shared memory
system:
• performance degradation due to contention, and
• coherence problems
Shared memory architecture cont.
[Figure 4.1: a set of processors and memory modules connected by an interconnection network]
Shared memory architecture cont.
• Performance degradation might happen when multiple processors try to
access the shared memory simultaneously.
• Caches: having multiple copies of data spread throughout the caches might lead
to a coherence problem.
- The copies in the caches are coherent if they are all equal to the same value.
- If one of the processors writes over the value of one of the copies, then the copy
becomes inconsistent because it no longer equals the value of the other copies.
4.1 CLASSIFICATION OF SHARED
MEMORY SYSTEMS
• The simplest shared memory system consists of one memory module (M)
that can be accessed from two processors (P1 and P2).
• An arbitration unit within the memory module passes requests through to a
memory controller. If the memory module is not busy and a single request
arrives, then the arbitration unit passes that request to the memory
controller and the request is satisfied. The module is placed in the busy
state while a request is being serviced. If a new request arrives while the
memory is busy servicing a previous request, the memory module sends a
wait signal, through the memory controller, to the processor making the new
request.
4.1 CLASSIFICATION OF SHARED
MEMORY SYSTEMS cont.
• In response, the requesting processor may hold its request on the line until the
memory becomes free or it may repeat its request some time later. If the
arbitration unit receives two requests, it selects one of them and passes it to the
memory controller. Again, the denied request can be either held to be served
next or it may be repeated some time later.
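To make the arbitration behavior concrete, here is a minimal Python sketch (not from the original slides; the FIFO policy, function name, and data layout are assumptions) of a single memory module that services one request at a time while denied requests are held on the line:

```python
from collections import deque

def arbitrate(requests, service_cycles=1):
    """Toy arbitration for one memory module: FIFO, one request serviced at a time.

    requests: dict mapping arrival cycle -> list of processor ids issuing a request
    Denied requests are held on the line (queued) until the module is free.
    Returns a dict mapping processor id -> cycle at which its request completes.
    """
    waiting = deque()
    finish = {}
    busy_until = 0
    last_cycle = max(requests) if requests else 0
    cycle = 0
    while cycle <= last_cycle or waiting:
        waiting.extend(requests.get(cycle, []))   # new requests arrive this cycle
        if waiting and cycle >= busy_until:       # module free: arbiter picks one request
            proc = waiting.popleft()              # simple FIFO arbitration
            busy_until = cycle + service_cycles   # module stays busy while servicing
            finish[proc] = busy_until
        cycle += 1                                # other requesters effectively get a wait signal
    return finish

# P1 and P2 both request in cycle 0; one is serviced first, the other waits.
print(arbitrate({0: ["P1", "P2"]}))   # {'P1': 1, 'P2': 2}
```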
• Categories of shared memory systems based on the interconnection network:
 Uniform Memory Access (UMA)
 Non-uniform Memory Access (NUMA)
 Cache-Only Memory Architecture (COMA)
Uniform Memory Access (UMA)
 In the UMA system a shared memory is accessible by all processors through an
interconnection network in the same way a single processor accesses its
memory.
 All processors have equal access time to any memory location.
 The interconnection network used may be a single bus, multiple buses, or a crossbar
switch (diagonal, straight).
 UMA systems are also called SMP (symmetric multiprocessor) systems because
access to shared memory is balanced (each processor has an equal opportunity to
read/write to memory, including equal access speed).
Nonuniform Memory Access
(NUMA)
• Each processor has part of the shared memory attached. The memory has a
single address space. Therefore, any processor could access any memory
location directly using its real address. However, the access time to modules
depends on the distance to the processor.
• The tree and the hierarchical bus networks are architectures used to interconnect
processors.
Cache-Only Memory Architecture
(COMA)
• Similar to the NUMA, each processor has part of the shared memory in the
COMA. However, in this case the shared memory consists of cache memory.
• COMA requires that data be migrated to the processor requesting it.
• There is no memory hierarchy and the address space is made of all the caches.
There is a cache directory (D) that helps in remote cache access.
4.2 BUS-BASED SYMMETRIC
MULTIPROCESSORS
 Shared memory systems can be designed using:
• bus-based (simplest network for shared memory systems).
• switch-based interconnection networks (message passing).
 The bus/cache architecture alleviates the need for expensive multi-ported
memories and interface circuitry as well as the need to adopt a message-
passing paradigm when developing application software.
 The bus may get saturated if multiple processors try to access the shared
memory (via the bus) simultaneously (the contention problem); caches are used to solve this.
 High speed caches connected to each processor on one side and the bus on the
other side mean that local copies of instructions and data can be supplied at the
highest possible rate.
 If the local processor finds all of its instructions and data in the local cache, the hit
rate is 100%; otherwise there are cache misses, and the missing instructions and data
must be copied from the global memory, across the bus, into the cache, and then
passed on to the local processor.
4.2 BUS-BASED SYMMETRIC
MULTIPROCESSORS cont.
 One of the goals of the cache is to maintain a high hit rate, or low miss rate
under high processor loads. A high hit rate means the processors are not using
the bus as much. Hit rates are determined by a number of factors, ranging from
the application programs being run to the manner in which cache hardware is
implemented.
4.2 BUS-BASED SYMMETRIC
MULTIPROCESSORS cont.
• We define the variables for hit rate, number of processors, processor speed, bus
speed, and processor duty cycle rates as follows:
 N = number of processors;
 h = hit rate of each cache, assumed to be the same for all caches;
 (1 - h) = miss rate of all caches;
 B = bandwidth of the bus, measured in cycles/second;
 I = processor duty cycle, assumed to be identical for all processors, in
fetches/cycle; and
 V = peak processor speed, in fetches/second
4.2 BUS-BASED SYMMETRIC
MULTIPROCESSORS cont.
• The effective bandwidth of the bus is BI fetches/second.
If each processor is running at a speed of V, then misses are being generated
at a rate of V(1 - h). For an N-processor system, misses are simultaneously
being generated at a rate of N(1 - h)V. This leads to saturation of the bus when
N processors simultaneously try to access the bus; to avoid saturation we need
N(1 - h)V ≤ BI. The maximum number of processors with cache memories that
the bus can support is given by the relation

N ≤ BI / ((1 - h)V)
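The bound can be expressed as a small Python helper; this is an illustrative sketch, and the function name and argument order are assumptions rather than anything from the slides:

```python
def max_processors(B, I, h, V):
    """Upper bound on the number of processors a single bus can support.

    B: bus bandwidth (cycles/second)
    I: processor duty cycle (fetches/cycle), identical for all processors
    h: cache hit rate (0..1), identical for all caches
    V: peak processor speed (fetches/second)

    Misses are generated at rate N*(1 - h)*V, so the bus keeps up only
    while N*(1 - h)*V <= B*I, i.e. N <= B*I / ((1 - h)*V).
    """
    return (B * I) / ((1 - h) * V)
```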
4.2 BUS-BASED SYMMETRIC
MULTIPROCESSORS cont.
• Example 1
Suppose a shared memory system is constructed from processors that can
execute V = 10^7 instructions/s and the processor duty cycle is I = 1. The caches
are designed to support a hit rate of 97%, and the bus supports a peak
bandwidth of B = 10^6 cycles/s. Then (1 - h) = 0.03, and the maximum number
of processors N satisfies

N ≤ (10^6 × 1) / (0.03 × 10^7) = 3.33.

Thus, the system we have in mind can support only three processors!
We might ask what hit rate is needed to support a 30-processor system. In this
case, h = 1 - BI/(NV) = 1 - (10^6 × 1)/((30)(10^7)) = 1 - 1/300, so for the system we
have in mind, h ≈ 0.9967. Increasing h by about 2.8% results in supporting a factor of
ten more processors.
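The numbers in Example 1 can be checked directly; the snippet below is a sketch with variable names chosen to match the slide's symbols:

```python
V = 1e7    # peak processor speed, fetches/second
I = 1      # processor duty cycle, fetches/cycle
h = 0.97   # cache hit rate
B = 1e6    # bus bandwidth, cycles/second

n_max = (B * I) / ((1 - h) * V)
print(n_max)              # 3.33...  -> the bus supports at most 3 processors

N = 30                    # desired processor count
h_needed = 1 - (B * I) / (N * V)
print(h_needed)           # 0.99666... ~= 0.9967
```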
4.3 BASIC CACHE COHERENCY
METHODS
• Multiple copies of data, spread throughout the caches, lead to a coherence
problem among the caches. The copies in the caches are coherent if they all
equal the same value. However, if one of the processors writes over the value of
one of the copies, then the copy becomes inconsistent because it no longer
equals the value of the other copies. If data are allowed to become inconsistent
(incoherent), incorrect results will be propagated through the system, leading to
incorrect final results. Cache coherence algorithms are needed to maintain a
level of consistency throughout the parallel system.
Cache–Memory Coherence
 In a single cache system, coherence between memory and the cache is
maintained using one of two policies:
• write-through:
The memory is updated every time the cache is updated.
• write-back:
The memory is updated only when the block in the cache is being
replaced (see the sketch below).
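As a toy illustration of the two policies (a sketch only; the class and method names are invented, not the book's implementation):

```python
class SingleCache:
    """Toy single-processor cache illustrating write-through vs. write-back."""

    def __init__(self, memory, policy="write-through"):
        self.memory = memory          # backing store: dict of address -> value
        self.policy = policy
        self.lines = {}               # address -> value currently cached
        self.dirty = set()            # addresses modified but not yet written to memory

    def write(self, addr, value):
        self.lines[addr] = value
        if self.policy == "write-through":
            self.memory[addr] = value   # memory updated on every cache write
        else:                           # write-back
            self.dirty.add(addr)        # memory updated only on replacement

    def evict(self, addr):
        """Replace a cache line; write-back flushes dirty data to memory."""
        if self.policy == "write-back" and addr in self.dirty:
            self.memory[addr] = self.lines[addr]
            self.dirty.discard(addr)
        self.lines.pop(addr, None)


memory = {0x10: 5}
cache = SingleCache(memory, policy="write-back")
cache.write(0x10, 7)
print(memory[0x10])   # still 5: memory is stale until the block is replaced
cache.evict(0x10)
print(memory[0x10])   # 7: the dirty block was written back on replacement
```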
Cache–Cache Coherence
• In a multiprocessing system, when a task running on processor P requests the data in
global memory location X, for example, the contents of X are copied to processor P’s
local cache, where they are passed on to P. Now, suppose processor Q also accesses X.
What happens if Q wants to write a new value over the old value of X?
 There are two fundamental cache coherence policies:
 write-invalidate
Maintains consistency by reading from local caches until a write occurs.
 write-update
Maintains consistency by immediately updating all copies in all caches (see the sketch below).
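A minimal sketch of the difference between the two policies, using one dictionary per processor cache (the names and structure are assumptions for illustration, not a real snooping protocol):

```python
def write(caches, writer, addr, value, policy):
    """Apply a write by one processor's cache under a coherence policy.

    caches: list of dicts, one per processor, mapping address -> value
    writer: index of the cache performing the write
    policy: "write-invalidate" or "write-update"
    """
    caches[writer][addr] = value
    for i, cache in enumerate(caches):
        if i == writer or addr not in cache:
            continue
        if policy == "write-invalidate":
            del cache[addr]          # other copies are invalidated and must be re-fetched
        else:                        # write-update
            cache[addr] = value      # other copies are updated immediately


X = 0x20
caches = [{X: 1}, {X: 1}, {X: 1}]            # P, Q, R all cache X = 1
write(caches, writer=1, addr=X, value=9, policy="write-invalidate")
print(caches)   # [{}, {32: 9}, {}]: only the writer's copy survives

caches = [{X: 1}, {X: 1}, {X: 1}]
write(caches, writer=1, addr=X, value=9, policy="write-update")
print(caches)   # [{32: 9}, {32: 9}, {32: 9}]: all copies updated
```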
Shared Memory System
Coherence
• The four combinations to maintain coherence among all caches and
global memory are:
 Write-update and write-through
 Write-update and write-back
 Write-invalidate and write-through
 Write-invalidate and write-back
• If we permit write-update and write-through directly on global memory
location X, the bus would start to get busy and ultimately all processors
would be idle while waiting for writes to complete. In write-update and write-
back, only the copies in the caches are updated. On the contrary, if the write is
limited to the copy of X in cache Q, the caches become inconsistent on X.
Setting the dirty bit prevents the spread of inconsistent values of X, but at
some point the inconsistent copies must be updated.