Lecture4

Transcript

  • 1. High Performance Computing • Jawwad Shamsi • Lecture #4 • 25th January 2010
  • 2. Recap • Pipelining • Superscalar Execution – Dependencies • Branch • Data • Resource • Effect of Latency and Memory • Effect of Parallelism
  • 3. Flynn’s taxonomy • Classifies computer architectures along the two dimensions of instruction and data streams • SISD: Single Instruction, Single Data (uniprocessor) • SIMD: Single Instruction, Multiple Data (vector processing) • MISD: Multiple Instruction, Single Data • MIMD: Multiple Instruction, Multiple Data (SMP, cluster, NUMA) • (a scalar-vs-SIMD sketch follows the transcript)
  • 4. MIMD • MIMD – Shared Memory (tightly coupled) • SMP (Symmetric Multiprocessing) • NUMA (Non-Uniform Memory Access) – Distributed Memory (loosely coupled) • Clusters
  • 5. Taxonomy of Parallel Processor Architectures
  • 6. Shared Address Space • Shared Memory • Distributed Memory
  • 7. SMP • Two or more similar processors • Same main memory and I/O • Can perform similar operations • Share access to I/O devices • (a shared-memory threading sketch follows the transcript)
  • 8. Multiprogramming and Multiprocessing
  • 9. SMP Advantages • Performance • Availability • Incremental growth • Scaling
  • 10. Block Diagram of Tightly Coupled Multiprocessor
  • 11. Cache Coherence • Multiple caches can hold copies of the same data, and those copies can become inconsistent – Coherence protocols? • (a data-race sketch follows the transcript)
  • 12. Processor Design: Modes of Parallelism • Two ways to increase parallelism – Superscalar execution • Instruction-level parallelism – Threading • Thread-level parallelism – Concept of multithreaded processors » May or may not be different from OS-level multi-threading • Temporal multi-threading (also called implicit) – Instructions from only one thread issue in a given cycle • Simultaneous multi-threading (explicit) – Instructions from more than one thread can be executed in the same cycle
  • 13. Scalar Processor Approaches • Single-threaded scalar – Simple pipeline – No multithreading • Interleaved multithreaded scalar – Easiest multithreading to implement – Switches threads at each clock cycle – Keeps pipeline stages close to fully occupied – Hardware must switch thread context between cycles • Blocked multithreaded scalar – A thread executes until a latency event occurs that would stall the pipeline – The processor then switches to another thread • (a toy scheduling sketch follows the transcript)
  • 14. Clusters • Alternative to SMP • High performance • High availability • Server applications • A group of interconnected whole computers working together as a unified resource • Illusion of being one machine • Each computer is called a node • (a message-passing sketch follows the transcript)
  • 15. Cluster Benefits • Absolute scalability • Incremental scalability • High availability • Superior price/performance
  • 16. Cluster Configurations - Standby Server, No Shared Disk
  • 17. Cluster vs. SMP • Both provide multiprocessor support to high-demand applications • Both are available commercially – SMP has been available longer • SMP: – Easier to manage and control – Closer to single-processor systems • Scheduling is the main difference • Less physical space • Lower power consumption • Clustering: – Superior incremental and absolute scalability – Superior availability • Redundancy
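
Illustration for slide 3 (Flynn's taxonomy): the sketch below contrasts a plain scalar loop (SISD-style, one element per instruction) with a SIMD version in which a single instruction operates on eight floats at once. It is a minimal sketch, assuming an x86-64 CPU with AVX and a build flag such as -mavx; the AVX intrinsics are my choice of illustration and are not named in the lecture.

// Sketch: scalar (SISD-style) loop vs. an explicit SIMD version using AVX
// intrinsics. Assumes an x86-64 CPU with AVX and compilation with -mavx.
#include <immintrin.h>
#include <cstdio>

void add_scalar(const float* a, const float* b, float* c, int n) {
    for (int i = 0; i < n; ++i)          // one element per step of the instruction stream
        c[i] = a[i] + b[i];
}

void add_simd(const float* a, const float* b, float* c, int n) {
    int i = 0;
    for (; i + 8 <= n; i += 8) {         // one instruction operates on 8 floats
        __m256 va = _mm256_loadu_ps(a + i);
        __m256 vb = _mm256_loadu_ps(b + i);
        _mm256_storeu_ps(c + i, _mm256_add_ps(va, vb));
    }
    for (; i < n; ++i)                   // scalar tail for leftover elements
        c[i] = a[i] + b[i];
}

int main() {
    float a[16], b[16], c_scalar[16], c_simd[16];
    for (int i = 0; i < 16; ++i) { a[i] = float(i); b[i] = 2.0f * i; }
    add_scalar(a, b, c_scalar, 16);
    add_simd(a, b, c_simd, 16);
    for (int i = 0; i < 16; ++i)
        if (c_scalar[i] != c_simd[i]) { std::printf("mismatch\n"); return 1; }
    std::printf("both versions agree, c[15] = %.1f\n", c_simd[15]);  // expect 45.0
}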
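
Illustration for slides 4, 6, 7, and 12 (shared address space / SMP / thread-level parallelism): a minimal sketch in which several threads of one process work on the same array in main memory, each summing a disjoint slice into its own result slot. The thread count and array size are arbitrary assumptions; the point is only that every thread sees the same shared memory.

// Sketch: thread-level parallelism on a shared-memory (SMP) machine.
// Each thread reads a disjoint slice of one shared array and writes its
// partial sum to its own slot, so no locking is needed.
#include <cstddef>
#include <cstdio>
#include <numeric>
#include <thread>
#include <vector>

int main() {
    const int nthreads = 4;
    std::vector<double> data(1'000'000, 1.0);      // shared data in main memory
    std::vector<double> partial(nthreads, 0.0);    // one result slot per thread

    std::vector<std::thread> workers;
    const std::size_t chunk = data.size() / nthreads;
    for (int t = 0; t < nthreads; ++t) {
        std::size_t lo = t * chunk;
        std::size_t hi = (t == nthreads - 1) ? data.size() : lo + chunk;
        workers.emplace_back([&, t, lo, hi] {
            partial[t] = std::accumulate(data.begin() + lo, data.begin() + hi, 0.0);
        });
    }
    for (auto& w : workers) w.join();

    double total = std::accumulate(partial.begin(), partial.end(), 0.0);
    std::printf("total = %.0f\n", total);           // expect 1000000
}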
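
Illustration for slide 11 (cache coherence): the coherence protocols themselves (snooping or directory-based schemes) live in hardware, so the sketch below only shows the software-visible problem they relate to. Two threads update the same counter: the plain int version is a data race and typically loses updates, while std::atomic forces every update to be propagated to both cores. The iteration count is an arbitrary assumption.

// Sketch: why coherent, synchronized access to shared data matters.
// Two threads increment the same counters held in shared memory.
#include <atomic>
#include <cstdio>
#include <thread>

int main() {
    const int iters = 1'000'000;

    int plain = 0;                       // unsynchronized: a data race, updates can be lost
    std::atomic<int> safe{0};            // atomic read-modify-write, visible to both threads

    auto work = [&] {
        for (int i = 0; i < iters; ++i) {
            ++plain;                     // racy increment
            safe.fetch_add(1, std::memory_order_relaxed);
        }
    };
    std::thread t1(work), t2(work);
    t1.join();
    t2.join();

    std::printf("plain = %d (often less than %d)\n", plain, 2 * iters);
    std::printf("safe  = %d (always %d)\n", safe.load(), 2 * iters);
}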
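
Illustration for slide 13 (interleaved vs. blocked multithreaded scalar pipelines): a toy software simulation, entirely my own construction, of the two switching policies. Each hardware thread is a string of ops, 'A' for a 1-cycle ALU op and 'M' for a memory op whose result arrives MEM_LAT cycles later; the instruction streams and the latency are made-up parameters, and which policy finishes first depends on the mix.

// Toy simulation of interleaved vs. blocked multithreaded scalar pipelines.
// 'A' = 1-cycle ALU op; 'M' = memory op whose result is ready MEM_LAT cycles
// after issue, so the issuing thread cannot issue again until then.
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

constexpr int MEM_LAT = 3;   // assumed memory latency in cycles

struct HwThread {
    std::string ops;         // instruction stream
    std::size_t pc = 0;      // next op to issue
    int stall = 0;           // cycles until this thread may issue again
    bool done() const { return pc >= ops.size(); }
};

int run(std::vector<HwThread> t, bool interleaved) {
    int cycle = 0, cur = 0;
    while (!t[0].done() || !t[1].done()) {
        ++cycle;
        for (auto& th : t)                       // latency counters tick every cycle
            if (th.stall > 0) --th.stall;
        int pick = -1;                           // choose a ready thread, preferring 'cur'
        for (int k = 0; k < 2 && pick < 0; ++k) {
            int cand = (cur + k) % 2;
            if (!t[cand].done() && t[cand].stall == 0) pick = cand;
        }
        if (pick < 0) continue;                  // pipeline bubble: nothing ready this cycle
        char op = t[pick].ops[t[pick].pc++];
        if (op == 'M') t[pick].stall = MEM_LAT;  // wait for memory before issuing again
        cur = interleaved ? (pick + 1) % 2       // interleaved: switch threads every cycle
                          : pick;                // blocked: stay until a stall forces a switch
    }
    return cycle;
}

int main() {
    // Thread 0 is memory-heavy, thread 1 is ALU-only; cycle counts depend on the mix.
    std::vector<HwThread> threads = { HwThread{"AMAMAMAM"}, HwThread{"AAAAAAAA"} };
    std::printf("interleaved: %d cycles\n", run(threads, true));
    std::printf("blocked:     %d cycles\n", run(threads, false));
}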
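
Illustration for slides 14-17 (clusters / distributed memory): a minimal message-passing sketch using MPI, which is my choice of library and is not named in the lecture. Each process (typically one per cluster node) holds only its local work, and results are combined by explicit communication rather than through shared memory. Assumes a standard MPI installation (for example, compile with mpic++ and launch with mpirun -np 4 ./a.out).

// Sketch: distributed-memory parallelism in the cluster style. Each process
// owns only its local slice of the work; results are combined by explicit
// message passing rather than by reading shared memory.
#include <mpi.h>
#include <cstdio>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);

    int rank = 0, size = 1;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   // this process's id within the job
    MPI_Comm_size(MPI_COMM_WORLD, &size);   // number of processes in the job

    // Each process computes a partial result over its own local data.
    double local = 0.0;
    for (int i = 0; i < 1000; ++i) local += rank + 1;   // stand-in for real work

    // Combine partial results on rank 0 with an explicit collective operation.
    double total = 0.0;
    MPI_Reduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        std::printf("sum across %d processes = %.0f\n", size, total);

    MPI_Finalize();
    return 0;
}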