The Multicore Midlife Crisis
Bogdan Marius Tudor
CSTalks, 30 March 2011
CSTalks - The Multicore Midlife Crisis - 30 Mar

  1. The Multicore Midlife Crisis (Bogdan Marius Tudor, CSTalks, 30 March 2011)
  2. Outline
     •  The Memory Problem
     •  Do We Need All These Cores?
     •  Tomorrow’s Multicore
     •  Research Perspective
     5/4/11
  3. Remember Single Core? [Image; source: Wikipedia]
  4. My Next Processors
     [Chart: cache size in kB (axis from 0 to 4000) for seven processors: 66 MHz (Apr-94), 200 MHz (Apr-98), 1000 MHz (Nov-01), 2250 MHz (May-04), 1600 MHz (Jul-06), 2400 MHz (Jul-08), 2400 MHz (Mar-11)]
  5. My Next Processors [Chart repeated from slide 4]
  6. So What?
     Yep, they improved the cache size. Do I care?
     The interesting part is why they did it.
  7. The Memory Problem
     •  Moore’s Law: the number of transistors doubles every 18 months
        –  Single core: new transistors = faster speed
        –  Multicore: new transistors = more cores
     •  Memory speed increase does not obey Moore’s Law!
     [Diagram: a processor with four cores sharing a cache, connected to memory]
  8. The Memory Problem
     •  Problem: More cores compete for the same slow memory!
     •  Implications:
     [Pipeline diagram: stages IF, ID, X, M, W. The M stage is stalled on an access to cache or RAM while later instructions queue behind it in IF and ID. :) 5 cycles, :( > 100 cycles]
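The stall cost on this slide can be sketched as an expected-cycles calculation. This is only an illustration: it assumes the slide's 5 cycles is the cost of a cache hit and takes 100 cycles (the slide says "> 100", so this is a lower bound) as the cost of going to RAM. The function name and the hit rates tried are hypothetical, not from the talk.

```python
def average_access_cycles(hit_rate, hit_cycles=5, miss_cycles=100):
    """Expected cycles per memory access for a given cache hit rate.

    hit_cycles and miss_cycles default to the slide's figures
    (assumed here: 5 cycles for a cache hit, at least 100 for RAM).
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * hit_cycles + (1.0 - hit_rate) * miss_cycles


if __name__ == "__main__":
    # Illustrative hit rates only; they are not measurements.
    for rate in (0.99, 0.95, 0.90):
        cycles = average_access_cycles(rate)
        print(f"hit rate {rate:.0%}: {cycles:.2f} cycles per access")
```

Under these assumptions, even a small drop in hit rate raises the average access cost noticeably, which is why cores competing for one cache hurt so much.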
  9. The Memory Problem
     •  Problem: More cores compete for the same slow memory!
     •  Solution: Increase cache size :)
        –  Maintain cache hit rate
           •  2x cache hit rate requires 4x cache size
           •  Exponential increase in #transistors needed
        –  Cache coherence overhead
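The "2x needs 4x" rule above can be sketched with a power-law model. This is an assumption, not something the slide spells out: it reads the rule as "halving the miss rate requires four times the cache", which matches the common square-root rule of thumb (miss rate proportional to cache size raised to -0.5). The function names and the exponent `alpha` are hypothetical.

```python
def relative_miss_rate(cache_scale, alpha=0.5):
    """Miss rate relative to the baseline when the cache grows by cache_scale.

    Assumes miss_rate ~ cache_size ** -alpha; alpha=0.5 is the
    square-root rule of thumb, an assumption rather than a measured value.
    """
    if cache_scale <= 0:
        raise ValueError("cache_scale must be positive")
    return cache_scale ** -alpha


def cache_scale_for_miss_ratio(target_ratio, alpha=0.5):
    """Cache growth factor needed to bring the miss rate down to target_ratio."""
    if not 0 < target_ratio <= 1:
        raise ValueError("target_ratio must be in (0, 1]")
    return target_ratio ** (-1.0 / alpha)


if __name__ == "__main__":
    for ratio in (0.5, 0.25, 0.125):
        scale = cache_scale_for_miss_ratio(ratio)
        print(f"miss rate x{ratio}: cache must grow x{scale:g}")
```

Under this model each further halving of the miss rate multiplies the required cache size by four, which is the growth the slide flags as a problem.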
  10. Increasing Cache Size
      Not practical!
      [Figure from: B. M. Rogers et al. Scaling the bandwidth wall: challenges in and avenues for CMP scaling. ISCA 2009]
  11. Other Approaches
      •  Improve memory speed
         –  Slow, power-hungry and error-prone
      •  Better caching
      •  Improve memory bandwidth
         –  Latency tradeoff
      •  Prefetch
         –  Mixed blessings
      •  Allow more in-flight requests
  12. Do We Need All These Cores?
      •  Average utilization: < 20%
      •  We don’t have too many parallel apps
      •  We just have enough compute power
      •  Until you try to encode an HD video
         –  Star Trek holodecks: not there yet
      •  CPU vendors still have to make a living
  13. Tomorrow’s Multicore [Image; source: Intel]
  14. Tomorrow’s Multicore
      •  Intel Core i3, i5, i7
         –  Video is integrated into CPU
         –  Must balance sequential and parallel performance
         –  Lower energy requirements than previous generations
      •  Heterogeneous cores
         –  Many slow cores, good at floating point
         –  Some general purpose cores
         –  “Combine” cores into super-cores
      •  Must live with the memory problems
  15. Tomorrow’s Multicore
      •  The number of cores is becoming less important
         –  They can’t keep increasing them
         –  i3, i5, i7: how many cores each?
  16. Tomorrow’s Multicore [Image; source: Wikipedia]
  17. Tomorrow’s Multicore
      •  The number of cores is becoming less important
         –  They can’t keep increasing them
         –  i3, i5, i7: how many cores each?
      •  What matters is what the system provides
         –  FLOP intensive: GPU-style cores
         –  I/O intensive: FAWN (CMU)
         –  Memory intensive: Opteron/Xeon NUMA servers
  18. A Research Perspective
      •  Coping with heterogeneity is hard
         –  Different degrees of parallelism have different sequential execution speeds
         –  Many tradeoffs: Speed vs. Energy vs. Memory intensity vs. I/O intensity
      •  Need models for heterogeneity
         –  Understand the cost of the applications in terms of FLOPS, INTOPS, memory, I/O etc.
      •  Silver lining: stick to sequential apps (?)
  19. A Research Perspective
      •  Coping with slow memory
      •  Need to improve data locality by orders of magnitude
         •  Compiler support, auto-tuners etc.
      •  Space-efficient data types:
         •  HOT area in algo & systems
         •  Bloom filters: NSDI’10: 3 papers!
         •  Succinct data structures: STOC’08-STOC’10
         •  Cache oblivious algorithms
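As a concrete instance of the space-efficient data types listed above, here is a minimal Bloom filter sketch. It is illustrative only and is not taken from the talk or from the NSDI papers it mentions: the class name, the default sizes, and the use of salted SHA-256 as the hash family are all assumptions chosen to keep the example short.

```python
import hashlib


class BloomFilter:
    """Set membership in a fixed number of bits, with possible false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # One bit per slot, packed eight to a byte.
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting SHA-256 with a seed byte
        # (an illustrative choice; real filters use cheaper hash functions).
        data = item.encode("utf-8")
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(bytes([seed]) + data).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        # False means definitely absent; True means "probably present".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


if __name__ == "__main__":
    seen = BloomFilter()
    for word in ("cache", "core", "memory"):
        seen.add(word)
    print(seen.might_contain("cache"), len(seen.bits), "bytes used")
```

The filter stores no keys at all, only the bit array, which is the property that makes it attractive when memory is the bottleneck; the price is an occasional false positive.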
  20. A Research Perspective
      •  Software-helped cache coherence
         –  Or go without it :)
      •  Renounce some programming patterns
         •  Java initializes all objects to some value…
         •  Rethink those hash tables
      •  Go for approximate solutions
         –  It’s better if you can provide error bounds
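One way to read "approximate solutions with error bounds" is sampling: touch only part of the data and report how far off the answer can be. The sketch below is an assumed example, not the speaker's method. It estimates a mean from a random sample drawn with replacement and attaches a Hoeffding bound, which requires knowing the range of the values in advance. The function name and parameters are hypothetical.

```python
import math
import random


def approximate_mean(values, sample_size, value_range, delta=0.05, rng=None):
    """Estimate the mean of values from a random sample.

    Returns (estimate, bound). By Hoeffding's inequality, the true mean
    lies within +/- bound of the estimate with probability at least
    1 - delta, provided every value lies in an interval of width value_range.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    rng = rng or random.Random()
    # Sampling with replacement keeps the draws independent.
    sample = [rng.choice(values) for _ in range(sample_size)]
    estimate = sum(sample) / sample_size
    bound = value_range * math.sqrt(math.log(2.0 / delta) / (2.0 * sample_size))
    return estimate, bound


if __name__ == "__main__":
    gen = random.Random(0)
    data = [gen.random() for _ in range(1_000_000)]  # values in [0, 1)
    estimate, bound = approximate_mean(data, 2_000, value_range=1.0, rng=gen)
    print(f"mean is about {estimate:.3f} +/- {bound:.3f} (95% confidence)")
```

The sample reads a small fraction of the data, so far fewer memory accesses are needed, and the bound tells the caller what was given up in exchange.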
  21. Discussion
      Thank you for your attention
