Lecture6

Published in: Technology

  1. High Performance Computing
     Jawwad Shamsi
     Lecture #6
     27th January 2010
  2. Recap
     Cache Coherence
     NUMA
  3. Today's topics
     Cache Coherence: Continuation
     Vector Processing
  4. Cache Coherence
     In SMP or NUMA systems, a data item may be held in multiple caches
     Each cached copy may hold a different value of the data item
     Coherency must be maintained
     How?
  5. Cache Coherence: Two Approaches
     Write back: main memory is updated only when the modified cache line is flushed.
     Write through: every write updates the cache as well as main memory.
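The difference between the two write policies above can be sketched with a toy model. This is only an illustration, not part of the lecture: the `Cache` class, the `demo` helper, and the use of a Python dict as "main memory" are all assumptions made for the example.

```python
class Cache:
    """Toy cache in front of a dict that stands in for main memory (hypothetical model)."""

    def __init__(self, memory, policy):
        if policy not in ("write-through", "write-back"):
            raise ValueError("unknown policy: " + policy)
        self.memory = memory
        self.policy = policy
        self.lines = {}         # address -> cached value
        self.dirty = set()      # addresses modified but not yet written to memory
        self.memory_writes = 0  # how many times main memory was written

    def write(self, addr, value):
        self.lines[addr] = value
        if self.policy == "write-through":
            # Write through: memory is updated on every write.
            self._write_memory(addr, value)
        else:
            # Write back: only mark the line dirty; memory is updated on flush.
            self.dirty.add(addr)

    def flush(self):
        for addr in sorted(self.dirty):
            self._write_memory(addr, self.lines[addr])
        self.dirty.clear()

    def _write_memory(self, addr, value):
        self.memory[addr] = value
        self.memory_writes += 1


def demo(policy):
    """Write 1, 2, 3 to address 0, then flush; report what memory saw."""
    memory = {0: 0}
    cache = Cache(memory, policy)
    for value in (1, 2, 3):
        cache.write(0, value)
    before_flush = memory[0]
    cache.flush()
    return before_flush, memory[0], cache.memory_writes
```

In this sketch, write-through keeps memory current at the cost of one memory write per store, while write-back leaves memory stale until the flush but writes it only once.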
  6. Implementations
     Software Solutions:
       Compile time decision
       Conservative
       Inefficient cache utilization
     Hardware Solutions:
       Runtime decision
       More effective
  7. Hardware-based solutions
     Directory Protocol
     Snoopy Protocol
  8. Directory
     Centralized controller
     An individual cache controller makes a request
     The centralized controller checks the request and issues a command
     The controller updates its directory information
  9. Directory
     Write
       Processor requests exclusive write access
       Controller sends a message
       Other copies are invalidated
     Read
       Controller issues a command to the holding processor
       Holding processor writes back to main memory
       Read is then permitted
  10. Directory
      Disadvantage
        Centralized controller
        Bottleneck
      Advantage
        Useful in large-scale systems
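The directory write and read flows described in the slides above could be sketched roughly as follows. This is a simplified model under stated assumptions, not the lecture's implementation: the `Directory` class, its field names, the one-value-per-address granularity, and the default value of 0 for untouched memory are all hypothetical.

```python
class Directory:
    """Toy centralized directory controller (hypothetical, one value per address)."""

    def __init__(self, num_processors):
        self.memory = {}                                   # main memory
        self.caches = [dict() for _ in range(num_processors)]
        self.sharers = {}   # addr -> set of processor ids holding a copy
        self.owner = {}     # addr -> processor id holding an exclusive, modified copy
        self.log = []       # commands issued by the controller

    def write(self, pid, addr, value):
        # The processor requests exclusive write access; the controller
        # sends invalidate messages to every other holder of the line.
        for other in sorted(self.sharers.get(addr, set()) - {pid}):
            self.caches[other].pop(addr, None)
            self.log.append(("invalidate", other, addr))
        self.sharers[addr] = {pid}
        self.owner[addr] = pid
        self.caches[pid][addr] = value

    def read(self, pid, addr):
        holder = self.owner.get(addr)
        if holder == pid:
            return self.caches[pid][addr]
        if holder is not None:
            # The controller commands the holding processor to write the
            # line back to main memory before the read is permitted.
            self.memory[addr] = self.caches[holder][addr]
            self.log.append(("writeback", holder, addr))
            del self.owner[addr]
        if addr not in self.caches[pid]:
            self.caches[pid][addr] = self.memory.get(addr, 0)
            self.sharers.setdefault(addr, set()).add(pid)
        return self.caches[pid][addr]
```

Every request passes through the single `Directory` object, which is exactly why the slides list the centralized controller as a bottleneck.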
  11. Snoopy Protocol
      Update operations are announced
      All cache controllers snoop
      Bus architecture
      Careful: increased bus traffic
  12. Snoopy Protocol
      Two approaches
      Write Invalidate
        One writer
        Multiple readers
        Exclusive: the writer invalidates the other caches' entries
      Write Update
        Multiple writers
        All writes are propagated to the other caches
  13. Write Invalidate
      The MESI Protocol: P4 processor
      Data cache: two status bits, 4 states
        Modified
        Exclusive
        Shared
        Invalid
      See Table
  14. 4 Possibilities
      Read Miss:
        EX to SH
        SH to SH
        MO to SH
      Read Hit
      Write Miss
        RWITM
        MO to IN
        SH to IN
      Write Hit
        SH to IN
        EX
        MO
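The four cases above can be sketched as a small state machine over one cache line. This follows the textbook MESI protocol rather than any specific processor; the `access` function and the single-letter state codes (`M`, `E`, `S`, `I` for MO, EX, SH, IN) are assumptions made for the example, and data movement (write-backs, RWITM bus traffic) is not modelled.

```python
def access(states, cpu, op):
    """Return the new MESI states of one line in every cache.

    states: list of "M", "E", "S" or "I", one entry per cache (hypothetical encoding)
    cpu:    index of the cache whose processor performs the access
    op:     "read" or "write"
    """
    new = list(states)
    others = [i for i in range(len(states)) if i != cpu]

    if op == "read":
        if states[cpu] != "I":
            return new                      # read hit: no state change
        # Read miss: any remote copy (M, E or S) drops to Shared.
        shared = any(states[i] != "I" for i in others)
        for i in others:
            if states[i] != "I":
                new[i] = "S"
        new[cpu] = "S" if shared else "E"   # Exclusive only if nobody else holds it
    elif op == "write":
        # Write miss (RWITM) or write hit: every remote copy is invalidated
        # and the local copy becomes Modified.
        for i in others:
            new[i] = "I"
        new[cpu] = "M"
    else:
        raise ValueError("op must be 'read' or 'write'")
    return new
```

For example, `access(["E", "I"], 1, "read")` models a read miss in the second cache while the first holds the line exclusively; both copies end up Shared.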
  15. L1-L2 Cache Consistency
  16. Parallel programming and Amdahl's Law
      Suppose 1/N of the time is spent in sequential code
      And 1 - 1/N in parallel code
  17. Amdahl's Law
      Speedup: the speed gain from using parallel processors vs. a single processor
      Speedup = 1/(s + (p/N))
      s = sequential fraction of the code, p = parallel fraction of the code (s + p = 1), N = no. of processors
      Speedup = T(1)/T(j), for j parallel processors
      As problem size increases, p may rise and s may decrease
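As a quick numerical check of the formula above, here is a minimal sketch. The function name `amdahl_speedup` and its argument names are chosen for this example only; it assumes s and p are fractions with s + p = 1.

```python
def amdahl_speedup(s, n):
    """Speedup = 1 / (s + p/n), with p = 1 - s.

    s: sequential fraction of the run time, between 0 and 1
    n: number of processors, at least 1
    """
    if not 0.0 <= s <= 1.0:
        raise ValueError("s must be between 0 and 1")
    if n < 1:
        raise ValueError("n must be at least 1")
    p = 1.0 - s
    return 1.0 / (s + p / n)


if __name__ == "__main__":
    # With 10% sequential code, adding processors gives diminishing returns.
    for n in (1, 2, 10, 100, 1000):
        print(n, round(amdahl_speedup(0.1, n), 3))
```

With s = 0.1 the speedup can never exceed 1/s = 10, however many processors are added, which is why a larger problem size (smaller s) matters.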