A PPT ON “SHARED MEMORY”
MADE BY: SANJANA BAKSHI 7IT087
TOPICS TO BE COVERED:
On chip memory
Bus based multiprocessor
Working with a cache
Write through cache
Write once protocol
Ring based multiprocessor
Similarities and differences between ring-based and bus-based multiprocessors
What is a DSM system?
In a distributed shared memory (DSM) system, a collection of workstations connected by a LAN (a distributed-memory system, often called a multicomputer) shares a single paged, virtual address space
Each page is present on exactly one machine
An attempt to reference a page on a different machine causes a hardware page fault, which traps to the operating system
The OS then sends a message to the remote machine, which finds the needed page and sends it back to the requesting processor
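The page-fault path described above can be sketched as a small simulation. This is only an illustrative model, not how any particular DSM system is implemented: the names `Machine`, `DSM`, `PAGE_SIZE`, and the `owner` lookup table are all hypothetical, and the "message to the remote machine" is reduced to a direct method call.

```python
PAGE_SIZE = 4  # words per page (tiny, assumed value for illustration)

class Machine:
    def __init__(self, name):
        self.name = name
        self.pages = {}  # page number -> list of words held locally

class DSM:
    """Toy model: every page is present on exactly one machine."""
    def __init__(self, machines, num_pages):
        self.machines = {m.name: m for m in machines}
        self.owner = {}   # page number -> name of machine holding it (assumed directory)
        self.faults = 0
        names = list(self.machines)
        for p in range(num_pages):
            home = names[p % len(names)]
            self.machines[home].pages[p] = [0] * PAGE_SIZE
            self.owner[p] = home

    def _local_page(self, requester, page):
        local = self.machines[requester].pages
        if page not in local:
            # "Page fault": ask the remote machine for the page and move it here.
            self.faults += 1
            remote = self.machines[self.owner[page]]
            local[page] = remote.pages.pop(page)
            self.owner[page] = requester
        return local[page]

    def read(self, requester, addr):
        page, offset = divmod(addr, PAGE_SIZE)
        return self._local_page(requester, page)[offset]

    def write(self, requester, addr, value):
        page, offset = divmod(addr, PAGE_SIZE)
        self._local_page(requester, page)[offset] = value

dsm = DSM([Machine("A"), Machine("B")], num_pages=4)
dsm.write("A", 5, 99)      # address 5 is on page 1, which starts on B: fault
print(dsm.read("B", 5))    # page 1 migrates back to B: second fault
print(dsm.faults)
```

Because a page lives on only one machine at a time, two machines that alternately touch the same page keep moving it back and forth, which is what the fault counter makes visible.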
What is shared memory?
Shared memory is memory that can be accessed simultaneously by more than one CPU or processor
There are local caches for each processor
Accessing a local cache is cheaper than accessing main memory
Shared memory is simple to program but hard to scale
Various architectures to be discussed:
On chip memory
Bus based multiprocessors
Ring based multiprocessors
On Chip Memory
Here the CPU portion of the chip has address and data lines that connect directly to the memory portion
Such chips are used in cars, appliances and even toys
In a hypothetical shared-memory multiprocessor we would have multiple CPUs directly sharing the same memory, but this would be complicated and expensive
[Figure: (a) A single-chip computer: CPU and memory in one chip package, with address and data lines connecting the CPU to the memory. (b) A hypothetical shared-memory multiprocessor: CPU1, CPU2, CPU3 and CPU4 sharing one memory.]
What is a bus?
A bus is a collection of parallel wires, some holding the address the CPU wants to read or write, some for sending or receiving data, and the rest for controlling the transfers.
In most systems buses are external and are used to connect CPUs, memories and I/O controllers
Bus-based multiprocessors
SMP: Symmetric Multi-Processing
All CPUs connected to one bus (backplane)
Memory and peripherals are accessed via shared bus.
System looks the same from any processor.
[Figure: CPU A, CPU B, memory and device I/O all attached to a single bus.]
Bus-based multiprocessors
Dealing with bus overload - add local memory
CPU does I/O to cache memory - access main memory on cache miss
[Figure: CPU A and CPU B, each with its own cache, attached to the bus together with memory and device I/O.]
Working with a cache
CPU A reads location 12345 from memory
[Figure: memory 12345 = 7; CPU A cache 12345 = 7; CPU B cache empty.]
Working with a cache
CPU A has modified location 12345 in its own cache (now 3); memory still holds 7
CPU B reads location 12345 from memory
Gets old value
Memory not coherent!
[Figure: memory 12345 = 7; CPU A cache 12345 = 3; CPU B cache 12345 = 7.]
Write-through cache … continued
CPU A's write of 3 has gone through to main memory
CPU B reads location 12345 from memory - loads into cache
[Figure: memory 12345 = 3; CPU A cache 12345 = 3; CPU B cache 12345 = 3.]
Write-through cache
CPU A modifies location 12345 - write-through
Cache on CPU B not updated
Memory not coherent!
[Figure: memory 12345 = 0; CPU A cache 12345 = 0; CPU B cache 12345 = 3.]
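The stale-copy problem on the last slide can be reproduced with a few lines of code. This is a minimal sketch under simplifying assumptions: main memory is a plain dictionary, each cache is a private dictionary, and the class name `WriteThroughCache` is made up for illustration. There is deliberately no snooping here.

```python
class WriteThroughCache:
    """Private cache that writes through to memory but never watches the bus."""
    def __init__(self, memory):
        self.memory = memory   # shared main memory: address -> value
        self.lines = {}        # this CPU's cached copies

    def read(self, addr):
        if addr not in self.lines:            # miss: fetch from main memory
            self.lines[addr] = self.memory[addr]
        return self.lines[addr]               # hit: use the local copy

    def write(self, addr, value):
        self.lines[addr] = value
        self.memory[addr] = value             # write-through to main memory

memory = {12345: 7}
cpu_a = WriteThroughCache(memory)
cpu_b = WriteThroughCache(memory)

cpu_a.write(12345, 3)        # memory and A's cache hold 3
cpu_b.read(12345)            # B loads 3 into its cache
cpu_a.write(12345, 0)        # memory and A's cache hold 0
print(memory[12345], cpu_a.read(12345), cpu_b.read(12345))
```

Memory itself is up to date after every write, yet CPU B keeps answering from its own cached copy, so writing through is not enough without some way to tell the other caches.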
Write once protocol
This protocol manages cache blocks, each of which can be in one of the following three states:
INVALID: This cache block does not contain valid data.
CLEAN: Memory is up-to-date; the block may be in other caches.
DIRTY: Memory is incorrect; no other cache holds the block.
The basic idea of the protocol is that a word that is being read by multiple CPUs is allowed to be present in all their caches. A word that is being heavily written by only one machine is kept in its cache and not written back to memory on every write to reduce bus traffic.
Write through protocol
Event      | Action taken by a cache in response to its own CPU's operation | Action taken by a cache in response to a remote CPU's operation
Read miss  | Fetch data from memory and store in cache                      | No action
Read hit   | Fetch data from local cache                                    | No action
Write miss | Update data in memory and store in cache                       | No action
Write hit  | Update memory and cache                                        | Invalidate cache entry
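The table above can be turned into a small simulation in which every write goes over the bus and the other caches snoop on it. This is a sketch only: `Bus`, `SnoopingCache` and `snoop_write` are hypothetical names, and the bus is modelled as a simple loop over the attached caches.

```python
class Bus:
    def __init__(self, memory):
        self.memory = memory   # main memory: address -> value
        self.caches = []

    def write(self, sender, addr, value):
        self.memory[addr] = value            # every write updates memory
        for cache in self.caches:
            if cache is not sender:
                cache.snoop_write(addr)      # remote caches see the write

class SnoopingCache:
    def __init__(self, bus):
        self.bus = bus
        self.lines = {}
        bus.caches.append(self)

    def read(self, addr):
        if addr not in self.lines:           # read miss: fetch and store
            self.lines[addr] = self.bus.memory[addr]
        return self.lines[addr]              # read hit: local cache

    def write(self, addr, value):
        self.lines[addr] = value             # write miss or hit: store locally
        self.bus.write(self, addr, value)    # and update memory

    def snoop_write(self, addr):
        self.lines.pop(addr, None)           # remote write hit: invalidate entry

bus = Bus({12345: 7})
cpu_a, cpu_b = SnoopingCache(bus), SnoopingCache(bus)
cpu_a.write(12345, 3)
cpu_b.read(12345)
cpu_a.write(12345, 0)        # B's entry is invalidated by snooping
print(cpu_b.read(12345))     # B misses and refetches from memory
```

The price of this scheme is that every write, even a repeated write by the same CPU, uses the bus.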
For example (CPUs A, B and C on one bus; W is a word in memory):
(a) Initial state: word W containing value W1 is in memory and is also cached by B. B's copy is CLEAN. Memory is correct.
(b) A reads word W and gets W1. B does not respond to the read, but the memory does. A and B both hold W1, CLEAN. Memory is correct.
(c) A writes a value W2. B snoops on the bus, sees the write, and invalidates its entry. A's copy is marked DIRTY. Memory is not updated and still holds W1.
(d) A writes W again (value W3). This and subsequent writes by A are done locally, without any bus traffic. A's copy is DIRTY, B's is INVALID. Memory is not updated.
(e) C reads or writes W. A sees the request by snooping on the bus, provides the value, and invalidates its own entry. C now has the only valid copy (W3, DIRTY); A and B are INVALID. Memory is not updated.
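Steps (a) to (e) can be traced with a small state machine for a single word W. This follows the simplified three-state protocol exactly as described on these slides (the first write goes straight to DIRTY without updating memory); it is a sketch, and `Bus`, `Cache`, `request` and `invalidate` are hypothetical names.

```python
INVALID, CLEAN, DIRTY = "INVALID", "CLEAN", "DIRTY"

class Bus:
    def __init__(self, memory_value):
        self.memory = memory_value   # value of word W in main memory
        self.caches = []
        self.transactions = 0        # counts bus traffic

    def request(self, requester):
        """Bus read of W. Returns (value, new state) for the requester."""
        self.transactions += 1
        for c in self.caches:
            if c is not requester and c.state == DIRTY:
                c.state = INVALID        # owner provides the value, drops its copy
                return c.value, DIRTY    # requester now holds the only valid copy
        return self.memory, CLEAN        # otherwise memory answers

    def invalidate(self, requester):
        self.transactions += 1
        for c in self.caches:
            if c is not requester:
                c.state = INVALID        # snooping caches discard their entry

class Cache:
    def __init__(self, bus):
        self.bus, self.state, self.value = bus, INVALID, None
        bus.caches.append(self)

    def read(self):
        if self.state == INVALID:
            self.value, self.state = self.bus.request(self)
        return self.value

    def write(self, value):
        if self.state != DIRTY:          # first write: announce it on the bus
            self.bus.invalidate(self)
            self.state = DIRTY
        self.value = value               # later writes are purely local

bus = Bus("W1")
a, b, c = Cache(bus), Cache(bus), Cache(bus)
b.read()          # (a) B caches W1, CLEAN
a.read()          # (b) A gets W1 from memory
a.write("W2")     # (c) B invalidated, A DIRTY
a.write("W3")     # (d) local write, no bus traffic
c.read()          # (e) A supplies W3 and invalidates itself
print(a.state, b.state, c.state, c.value, bus.memory, bus.transactions)
```

Compared with write-through, the second write by A in step (d) adds nothing to the transaction count, which is the saving the protocol is after.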
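Ring-Based Multiprocessors: Memnet
[Figure: CPUs connected in a ring. Each machine has a CPU, an MMU (memory management unit), a cache, home memory and private memory. The block table has one entry per block (0, 1, 2, 3, ...) with the fields Valid, Exclusive, Home, Interrupt and Location.]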
When the CPU wants to read a word from shared memory, the memory address to be read is passed to the Memnet device, which checks the block table to see if the block is present. If so, the request is satisfied. If not, the Memnet device waits until it captures the circulating token and then puts a request packet onto the ring. As the packet passes around the ring, each Memnet device along the way checks to see if it has the block needed. If so, it puts the block in the packet's dummy field and modifies the packet header to inhibit subsequent machines from doing so.
If the requesting machine has no free space in its cache to hold the incoming block, it makes space by picking a cached block at random and sending it home. Blocks whose Home bit is set are never chosen because they are already home.
Writes are handled in one of three ways, depending on where the block is.
If the block containing the word to be written is present and is the only copy in the system (i.e., the Exclusive bit is set), the word is just written locally.
If the needed block is present but it is not the only copy, an invalidation packet is first sent around the ring to force all other machines to discard their copies of the block about to be written. When the invalidation packet arrives back at the sender, the Exclusive bit is set for that block and the write proceeds locally.
If the block is not present, a packet is sent out that combines a read request and an invalidation request. The first machine that has the block copies it into the packet and discards its own copy. All subsequent machines just discard the block from their caches. When the packet comes back to the sender, the block is stored there and the word is written.
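The read path and the three write cases can be sketched as follows. This is a rough model under several assumptions, not the Memnet hardware: `Device`, `Ring` and their methods are hypothetical names, the token and packets are reduced to a loop over the other machines in ring order, caches are unbounded (so the random eviction step is omitted), and a supplier's Exclusive bit is assumed to be cleared when it answers a read.

```python
class Device:
    def __init__(self, name):
        self.name = name
        self.table = {}  # block number -> {"valid", "exclusive", "home", "data"}

    def has(self, block):
        entry = self.table.get(block)
        return entry is not None and entry["valid"]

    def discard(self, block):
        if block in self.table:
            self.table[block].update(valid=False, exclusive=False)

class Ring:
    def __init__(self, names, num_blocks, words=4):
        self.devices = [Device(n) for n in names]
        self.packets = 0
        for b in range(num_blocks):
            home = self.devices[b % len(self.devices)]
            home.table[b] = {"valid": True, "exclusive": True,
                             "home": True, "data": [0] * words}

    def _others(self, sender):
        # The other machines, in the order a packet from sender would visit them.
        i, n = self.devices.index(sender), len(self.devices)
        return [self.devices[(i + k) % n] for k in range(1, n)]

    def _store(self, dev, block, data, exclusive):
        entry = dev.table.setdefault(block, {"home": False})
        entry.update(valid=True, exclusive=exclusive, data=data)

    def read(self, dev, block, offset):
        if not dev.has(block):
            self.packets += 1
            for other in self._others(dev):
                if other.has(block):          # first holder fills the packet
                    other.table[block]["exclusive"] = False
                    self._store(dev, block, list(other.table[block]["data"]), False)
                    break                     # later machines are inhibited
        return dev.table[block]["data"][offset]

    def write(self, dev, block, offset, value):
        if dev.has(block) and dev.table[block]["exclusive"]:
            pass                              # case 1: only copy, write locally
        elif dev.has(block):
            self.packets += 1                 # case 2: invalidation packet
            for other in self._others(dev):
                other.discard(block)
            dev.table[block]["exclusive"] = True
        else:
            self.packets += 1                 # case 3: read + invalidate
            data = None
            for other in self._others(dev):
                if data is None and other.has(block):
                    data = list(other.table[block]["data"])
                other.discard(block)
            self._store(dev, block, data, True)
        dev.table[block]["data"][offset] = value

ring = Ring(["M0", "M1", "M2"], num_blocks=3)
m0, m1, m2 = ring.devices
ring.write(m1, 1, 0, 5)        # case 1: block 1 is home and exclusive on M1
print(ring.read(m0, 1, 0))     # M0 fetches a shared copy over the ring
ring.write(m0, 1, 0, 6)        # case 2: M1's copy is invalidated
ring.write(m2, 1, 1, 9)        # case 3: M2 pulls the block and invalidates M0
print(m2.table[1]["data"], ring.packets)
```

In this model a discarded home block keeps its table entry with the Valid bit cleared, which is one simple way to remember where the block belongs when it is later sent home.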
Similarities between bus-based and ring-based multiprocessors
In both cases read operations always return the values most recently written
In both designs a block may be absent from a cache, present in multiple caches for reading, or present in a single cache for writing
DIFFERENCES BETWEEN THE TWO MULTIPROCESSORS
BUS-BASED MULTIPROCESSORS
They are tightly coupled, with the CPUs normally in a single rack
They have a separate global memory
RING-BASED MULTIPROCESSORS
The machines can be much more loosely coupled, and this loose coupling can affect their performance