Lecture 4


  1. High Performance Computing
     Jawwad Shamsi
     Lecture #4, 25th January 2010
  2. Recap
     • Pipelining
     • Superscalar execution
       – Dependencies: branch, data, resource
     • Effect of latency and memory
     • Effect of parallelism
  3. Flynn's Taxonomy
     Classifies multiprocessor computers along the two dimensions of instruction streams and data streams:
     • SISD: Single Instruction, Single Data (uniprocessor)
     • SIMD: Single Instruction, Multiple Data (vector processing); see the sketch below
     • MISD: Multiple Instruction, Single Data
     • MIMD: Multiple Instruction, Multiple Data (SMP, cluster, NUMA)
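     To make the SISD/SIMD distinction concrete, here is a minimal C sketch (the array names and the OpenMP simd pragma are illustrative assumptions, not from the slides; compile with -fopenmp for the pragma to take effect). The first loop processes one data element per instruction; the second asks the compiler to apply the same instruction to several elements at once.

         #include <stddef.h>

         #define N 1024
         float a[N], b[N], c[N];

         /* SISD view: one instruction stream, one data element at a time. */
         void add_sisd(void) {
             for (size_t i = 0; i < N; i++)
                 c[i] = a[i] + b[i];
         }

         /* SIMD view: the same add is applied across multiple data elements;
          * the pragma (OpenMP 4.0+) asks the compiler to vectorize the loop. */
         void add_simd(void) {
             #pragma omp simd
             for (size_t i = 0; i < N; i++)
                 c[i] = a[i] + b[i];
         }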
  4. MIMD
     • Shared memory (tightly coupled)
       – SMP (Symmetric Multiprocessing)
       – NUMA (Non-Uniform Memory Access)
     • Distributed memory (loosely coupled)
       – Clusters
  5. Taxonomy of Parallel Processor Architectures
  6. Shared Address Space
     • Shared memory
     • Distributed memory
  7. SMP
     • Two or more similar processors
     • Share the same main memory and I/O facilities
     • All processors can perform the same functions
     • Share access to I/O devices
     (see the shared-memory threading sketch below)
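     A minimal POSIX-threads sketch of the shared-memory model an SMP exposes (the array, the two-way split, and the sizes are illustrative assumptions; compile with -pthread). Both threads write into the same array in main memory, and the main thread reads the combined result without any explicit data transfer.

         #include <pthread.h>
         #include <stdio.h>

         #define N 8
         static int data[N];                    /* one array, shared by all threads */

         static void *worker(void *arg) {
             int half = *(int *)arg;            /* 0 -> first half, 1 -> second half */
             for (int i = half * (N / 2); i < (half + 1) * (N / 2); i++)
                 data[i] = i * i;               /* writes go to the shared main memory */
             return NULL;
         }

         int main(void) {
             pthread_t t[2];
             int ids[2] = {0, 1};
             for (int k = 0; k < 2; k++)
                 pthread_create(&t[k], NULL, worker, &ids[k]);
             for (int k = 0; k < 2; k++)
                 pthread_join(t[k], NULL);
             for (int i = 0; i < N; i++)
                 printf("%d ", data[i]);        /* main thread sees both halves */
             printf("\n");
             return 0;
         }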
  8. Multiprogramming and Multiprocessing
  9. SMP Advantages
     • Performance
     • Availability
     • Incremental growth
     • Scaling
  10. Block Diagram of Tightly Coupled Multiprocessor
  11. Cache Coherence
      • Each processor's cache can hold its own copy of the same memory location, so the copies can diverge
        – Protocols? (see the cache-line sketch below)
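      A small C sketch of one visible consequence of keeping cached copies coherent (the 64-byte line size is an assumption typical of x86, not stated in the slides). When two cores write different fields that sit in the same cache line, the coherence protocol must repeatedly invalidate the other core's copy; aligning each field to its own line removes that traffic.

          #include <stdint.h>
          #include <stdalign.h>

          /* Both counters share one cache line: core 0 writing a and core 1 writing b
           * force the coherence protocol to bounce the line between the two caches. */
          struct shared_line {
              uint64_t a;
              uint64_t b;
          };

          /* 64-byte alignment gives each counter its own line, so a write by one core
           * no longer invalidates the line cached by the other core. */
          struct padded_lines {
              alignas(64) uint64_t a;
              alignas(64) uint64_t b;
          };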
  12. Processor Design: Modes of Parallelism
      • Two ways to increase parallelism
        – Superscaling: instruction-level parallelism
        – Threading: thread-level parallelism
          » Concept of multithreaded processors (may or may not be different from OS-level multithreading)
      • Temporal multithreading (also called implicit)
        – Instructions from only one thread per cycle
      • Simultaneous multithreading (explicit)
        – Instructions from more than one thread can be executed per cycle
  13. Scalar Processor Approaches
      • Single-threaded scalar
        – Simple pipeline
        – No multithreading
      • Interleaved multithreaded scalar
        – Easiest multithreading to implement
        – Switch threads at each clock cycle
        – Keeps pipeline stages close to fully occupied
        – Hardware must switch thread context between cycles
      • Blocked multithreaded scalar
        – A thread executes until a latency event occurs that would stall the pipeline
        – Processor then switches to another thread
  14. Clusters
      • Alternative to SMP
      • High performance
      • High availability
      • Server applications
      • A group of interconnected whole computers working together as a unified resource
        – Illusion of being one machine
        – Each computer is called a node
      (see the message-passing sketch below)
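      Cluster nodes have no shared memory (slide 4 groups them under distributed memory), so nodes cooperate by exchanging messages. A minimal MPI sketch, assuming an MPI installation (compile with mpicc, launch with mpirun -np 2): rank 0 sends an integer that rank 1 receives over the interconnect.

          #include <mpi.h>
          #include <stdio.h>

          int main(int argc, char **argv) {
              int rank, value = 0;
              MPI_Init(&argc, &argv);
              MPI_Comm_rank(MPI_COMM_WORLD, &rank);

              if (rank == 0) {
                  value = 42;
                  /* node 0 sends its local value to node 1 */
                  MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
              } else if (rank == 1) {
                  /* node 1 receives it by message, not through shared memory */
                  MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                           MPI_STATUS_IGNORE);
                  printf("rank 1 received %d\n", value);
              }
              MPI_Finalize();
              return 0;
          }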
  15. Cluster Benefits
      • Absolute scalability
      • Incremental scalability
      • High availability
      • Superior price/performance
  16. Cluster Configurations: Standby Server, No Shared Disk
  17. Cluster vs. SMP
      • Both provide multiprocessor support for high-demand applications
      • Both are available commercially (SMP for longer)
      • SMP:
        – Easier to manage and control
        – Closer to single-processor systems (scheduling is the main difference)
        – Less physical space, lower power consumption
      • Clustering:
        – Superior incremental and absolute scalability
        – Superior availability through redundancy
