Transcript of "[Harvard CS264] 01 - Introduction"

  1. 1. Massively Parallel Computing CS 264 / CSCI E-292 | Lecture #1: Introduction | January 25th, 2011 | Nicolas Pinto (MIT, Harvard) pinto@mit.edu
  2. 2. ...
  3. 3. Distant Students
  4. 4. Take a picture with...
  5. 5. a friend / I like
  6. 6. his dog / I like
  7. 7. cool hardware
  8. 8. your mom
  9. 9. Send it to: pinto@mit.edu
  10. 10. Today
  11. 11. Outline
  12. 12. Outline
  13. 13. Massively Parallel Computing (MPC): supercomputing, many-core computing, high-throughput computing, cloud computing, human "computing"?
  14. 14. Massively Parallel Computing (MPC): supercomputing, many-core computing, high-throughput computing, cloud computing, human "computing"?
  15. 15. http://www.youtube.com/watch?v=jj0WsQYtT7M
  16. 16. Modeling & Simulation • Physics, astronomy, molecular dynamics, finance, etc. • Data and processing intensive • Requires high-performance computing (HPC) • Driving HPC architecture development
  17. 17. CS 264 (2009): Top Dog (2008) • Roadrunner, LANL • #1 on top500.org in 2008 (now #7) • 1.105 petaflop/s • 3000 nodes with dual-core AMD Opteron processors • Each node connected via PCIe to two IBM Cell processors • Nodes are connected via InfiniBand 4x DDR
  18. 18. http://www.top500.org/lists/2010/11
  19. 19. Tianhe-1A at NSC Tianjin: 2.507 petaflop/s, 7168 Tesla M2050 GPUs. 1 petaflop/s = ~1M high-end laptops = ~world population with hand calculators 24/7/365 for ~16 years. Slide courtesy of Bill Dally (NVIDIA)
  20. 20. http://news.cnet.com/8301-13924_3-20021122-64.html
  21. 21. What $100+ million can buy you... Roadrunner (#7), Jaguar (#2)
  22. 22. Roadrunner (#7) http://www.lanl.gov/roadrunner/
  23. 23. Jaguar (#2)
  24. 24. Who uses HPC?
  25. 25. Who uses HPC?
  26. 26. Massively Parallel Computing (MPC): supercomputing, many-core computing, high-throughput computing, cloud computing, human "computing"?
  27. 27. Cloud Computing?
  28. 28. Buzzword ?
  29. 29. Careless Computing?
  30. 30. Response from the legend:...
  31. 31. http://techcrunch.com/2010/12/14/stallman-cloud-computing-careless-computing/
  32. 32. Cloud Utility Computing? for CS264
  33. 33. http://code.google.com/appengine/
  34. 34. http://aws.amazon.com/ec2/
  35. 35. http://www.nilkanth.com/my-uploads/2008/04/comparingpaas.png
  36. 36. Web Data Explosion
  37. 37. How much Data? • Google processes 24 PB / day, 8 EB / year (’10) • Wayback Machine has 3 PB, 100 TB/month (’09) • Facebook user data: 2.5 PB, 15 TB/day (’09) • Facebook photos: 15 B, 3 TB/day (’09) - 90 B (now) • eBay user data: 6.5 PB, 50 TB/day (’09) • “all words ever spoken by human beings” ~ 42 ZB. Adapted from http://www.umiacs.umd.edu/~jimmylin/cloud-2010-Spring/
  38. 38. “640k ought to be enough for anybody.” - Bill Gates, 1981 (just a rumor)
  39. 39. Disk Throughput • Average Google job size: 180 GB • 1 SATA HDD = 75 MB / sec • Time to read 180 GB off disk: 45 mins • Solution: parallel reads • 1000 HDDs = 75 GB / sec • Google’s solutions: BigTable, MapReduce, etc.
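The disk-throughput argument is just arithmetic; the sketch below is not from the slides and only reuses the numbers quoted above, showing why parallel reads close the gap.

```python
# Back-of-the-envelope disk throughput, using the numbers quoted on the slide.
job_size_gb = 180.0      # average Google job size
hdd_mb_per_s = 75.0      # one SATA HDD
n_disks = 1000           # parallel reads across many disks

serial_s = job_size_gb * 1024 / hdd_mb_per_s
parallel_s = serial_s / n_disks

print(f"1 disk:     {serial_s / 60:.0f} minutes")    # ~41 min (the slide rounds to ~45)
print(f"{n_disks} disks: {parallel_s:.1f} seconds")  # ~2.5 s at ~75 GB/s aggregate
```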
  40. 40. Cloud Computing • Clear trend: centralization of computing resources in large data centers • Q: What do Oregon, Iceland, and abandoned mines have in common? • A: Fiber, juice, and space • Utility computing!
  41. 41. Massively Parallel Computing (MPC): supercomputing, many-core computing, high-throughput computing, cloud computing, human "computing"?
  42. 42. Instrument Data Explosion: Sloan Digital Sky Survey, ATLUM / Connectome Project
  43. 43. Another example? hint: Switzerland
  44. 44. CERN in 2005....
  45. 45. CERN Summer School 2005
  46. 46. CERN Summer School 2005: bad taste party...
  47. 47. CERN Summer School 2005: pitchers...
  48. 48. LHC
  49. 49. LHC Maximilien Brice, © CERN
  50. 50. LHC Maximilien Brice, © CERN
  51. 51. CERN’s Cluster: ~5000 nodes (‘05)
  52. 52. CERN Summer School 2005: presentations...
  53. 53. Diesel Powered HPC... Life Support... Murchison Widefield Array (slide courtesy of Hanspeter Pfister)
  54. 54. How much Data? • NOAA has ~1 PB climate data (‘07) • MWA radio telescope: 8 GB/sec of data • Connectome: 1 PB / mm^3 of brain tissue (1 EB for 1 cm^3) • CERN’s LHC will generate 15 PB a year (‘08)
  55. 55. High Flops / Watt
  56. 56. Massively Parallel Computing (MPC): supercomputing, many-core computing, high-throughput computing, cloud computing, human "computing"?
  57. 57. Computer Games • PC gaming business: • $15B / year market (2010) • $22B / year in 2015? • WoW: $1B / year • NVIDIA shipped 1B GPUs since 1993: • 10 years to ship 200M GPUs (1993-2003) • 1/3 of all PCs have more than one GPU • High-end GPUs sell for around $300 • Now used for science applications
  58. 58. CryEngine 2, CRYTEK
  59. 59. Many-Core Processors: Intel Core i7-980X Extreme (6 cores, 1.17B transistors) vs. NVIDIA GTX 580 SC (512 cores, 3B transistors). http://en.wikipedia.org/wiki/Transistor_count
  60. 60. Data Throughput: GPUs target massive data parallelism on huge data sets; CPUs target instruction-level parallelism on data that fits in cache. (David Kirk, NVIDIA)
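To make "massive data parallelism" concrete, here is a minimal sketch that launches one GPU thread per array element. It is not part of these slides: it uses PyCUDA (assumed installed, along with numpy and a CUDA-capable GPU), and the kernel string itself is ordinary CUDA C.

```python
# Minimal data-parallel sketch: one GPU thread per array element (PyCUDA assumed).
import numpy as np
import pycuda.autoinit                     # creates a CUDA context on the default GPU
import pycuda.driver as drv
from pycuda.compiler import SourceModule

mod = SourceModule("""
__global__ void saxpy(float *y, const float *x, float a, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;   // global thread index
    if (i < n)
        y[i] = a * x[i] + y[i];                      // each thread handles one element
}
""")
saxpy = mod.get_function("saxpy")

n = 1 << 20
x = np.random.randn(n).astype(np.float32)
y = np.random.randn(n).astype(np.float32)
y_ref = 2.0 * x + y                        # CPU reference for checking

threads = 256
blocks = (n + threads - 1) // threads
saxpy(drv.InOut(y), drv.In(x), np.float32(2.0), np.int32(n),
      block=(threads, 1, 1), grid=(blocks, 1))

assert np.allclose(y, y_ref, atol=1e-5)
```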
  61. 61. 3 of Top 5 Supercomputers [bar chart; axis labels not recoverable] (Bill Dally, NVIDIA)
  62. 62. Personal Supercomputers: ~4 teraflops @ 1500 watts
  63. 63. Disruptive Technologies • Utility computing • Commodity off-the-shelf (COTS) hardware • Compute servers with 100s-1000s of processors • High-throughput computing • Mass-market hardware • Many-core processors with 100s-1000s of cores • High compute density / high flops/W
  64. 64. Green HPC: NVIDIA/NCSA Green 500 Entry
  65. 65. Green HPC: NVIDIA/NCSA Green 500 Entry. 128 nodes, each with: 1x Core i3 530 (2 cores, 2.93 GHz => 23.4 GFLOPS peak), 1x Tesla C2050 (14 cores, 1.15 GHz => 515.2 GFLOPS peak), 4x QDR InfiniBand, 4 GB DRAM. Theoretical peak perf: 68.95 TF. Footprint: ~20 ft^2 => 3.45 TF/ft^2. Cost: $500K (street price) => 137.9 MF/$. Linpack: 33.62 TF, 36.0 kW => 934 MF/W
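The derived figures on this Green 500 slide follow directly from its per-node specs; the quick sanity check below (my arithmetic, not from the slides) reproduces them to within rounding.

```python
# Sanity-check the Green 500 entry's derived figures from its per-node specs.
nodes = 128
gflops_per_node = 23.4 + 515.2            # Core i3 530 + Tesla C2050 peak

peak_tf = nodes * gflops_per_node / 1e3   # ~68.9 TF (slide: 68.95 TF)
tf_per_ft2 = peak_tf / 20.0               # ~3.45 TF/ft^2
mf_per_dollar = peak_tf * 1e6 / 500_000   # ~137.9 MF/$
mf_per_watt = 33.62e6 / 36_000            # Linpack 33.62 TF at 36.0 kW -> ~934 MF/W

print(peak_tf, tf_per_ft2, mf_per_dollar, mf_per_watt)
```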
  66. 66. One more thing...
  67. 67. Massively Parallel Computing (MPC): supercomputing, many-core computing, high-throughput computing, cloud computing, human "computing"?
  68. 68. Massively Parallel Computing (MPC): supercomputing, many-core computing, high-throughput computing, cloud computing, human "computing"?
  69. 69. Massively Parallel Human Computing??? • “Crowdsourcing” • Amazon Mechanical Turk (artificial artificial intelligence) • Wikipedia • Stack Overflow • etc.
  70. 70. What is this course about?
  71. 71. What is this course about? Massively parallel processors • GPU computing with CUDA Cloud computing • Amazon’s EC2 as an example of utility computing • MapReduce, the “back-end” of cloud computing
  72. 72. Less like Rodin...
  73. 73. More like Bob...
  74. 74. Outline
  75. 75. wikipedia.org
  76. 76. Anant Agarwal, MIT
  77. 77. Power Cost • Power ∝ Voltage^2 x Frequency • Frequency ∝ Voltage • Power ∝ Frequency^3 (Jack Dongarra)
  78. 78. Power Cost (Jack Dongarra) • CPU: 1 core, freq 1, perf 1, power 1, perf/W 1x • “New” CPU: 1 core, freq 1.5, perf 1.5, power 3.3, perf/W 0.45x • Multicore: 2 cores, freq 0.75, perf 1.5, power 0.8, perf/W 1.88x
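The table follows from the cube law on the previous slide under one common reading (power ∝ cores x frequency^3, performance ∝ cores x frequency); the few lines below, added here for illustration, reproduce its rows to within the slide's rounding.

```python
# Reproduce the Power Cost table from Power ~ cores * freq^3 and Perf ~ cores * freq.
def row(cores, freq):
    perf = cores * freq            # throughput scales with cores and clock
    power = cores * freq ** 3      # per-core power scales with freq^3
    return perf, power, perf / power

for name, cores, freq in [("CPU", 1, 1.0), ('"New" CPU', 1, 1.5), ("Multicore", 2, 0.75)]:
    perf, power, ppw = row(cores, freq)
    print(f"{name:12s} perf={perf:.2f} power={power:.2f} perf/W={ppw:.2f}x")
# "New" CPU: perf 1.5, power ~3.4, perf/W ~0.44x (slide rounds to 3.3 and 0.45x)
# Multicore: perf 1.5, power ~0.84, perf/W ~1.78x (slide rounds to 0.8 and 1.88x)
```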
  79. 79. Problem with Buses Anant Agarwal, MIT
  80. 80. Problem with Memory http://www.OpenSparc.net/
  81. 81. Problem with Disks 64 MB / sec Tom’s Hardware
  82. 82. Good News • Moore’s Law marches on • Chip real-estate is essentially free • Many-core architectures are commodities • Space for new innovations
  83. 83. Bad News • Power limits improvements in clock speed • Parallelism is the only route to improve performance • Computation / communication ratio will get worse • More frequent hardware failures?
  84. 84. Bad News
  85. 85. A “Simple” Matter of Software • We have to use all the cores efficiently • Careful data and memory management • Must rethink software design • Must rethink algorithms • Must learn new skills! • Must learn new strategies! • Must learn new tools...
  86. 86. Our mantra: always use the right tool !
  87. 87. Outline
  88. 88. Instructor: Nicolas Pinto The Rowland Institute at Harvard HARVARD UNIVERSITY
  89. 89. ~50% of [the brain] is for vision!
  90. 90. Everyone knows that...
  91. 91. The Approach: Reverse and Forward Engineering the Brain
  92. 92. The Approach: Reverse and Forward Engineering the Brain. REVERSE: study the natural system. FORWARD: build an artificial system.
  93. 93. brain = 20 petaflops!
  94. 94. http://vimeo.com/7945275
  95. 95. “If you want to have good ideas you must have many ideas.” “Most of them will be wrong, and what you have to learn is which ones to throw away.” - Linus Pauling (double Nobel Prize Winner)
  96. 96. High-throughput Screening
  97. 97. The curse of speed... and the blessing of massively parallel computing: thousands of big models, large amounts of unsupervised learning experience
  98. 98. The curse of speed... and the blessing of massively parallel computing: No off-the-shelf solution? DIY! Engineering (Hardware/SysAdmin/Software) & Science. Leverage non-scientific high-tech markets and their $billions of R&D... Gaming: Graphics Cards (GPUs), PlayStation 3. Web 2.0: Cloud Computing (Amazon, Google)
  99. 99. Build your own!
  100. 100. The blessing of GPUs DIY GPU pr0n (since 2006) Sony Playstation 3s (since 2007)
  101. 101. Speed (in billion floating point operations per second): Q9450 (Matlab/C) [2008]: 0.3; Q9450 (C/SSE) [2008]: 9.0; 7900GTX (OpenGL/Cg) [2006]: 68.2; PS3/Cell (C/ASM) [2007]: 111.4; 8800GTX (CUDA 1.x) [2007]: 192.7; GTX280 (CUDA 2.x) [2008]: 339.3; GTX480 (CUDA 3.x, Fermi) [2010]: 974.3. A >1000X speedup is game changing... Pinto, Doukhan, DiCarlo, Cox PLoS 2009; Pinto, Cox GPU Comp. Gems 2011
  102. 102. Tired of waiting for your computations? Supercomputing on your desktop: programming the next generation of cheap and massively parallel hardware using CUDA. This IAP [course] has been designed to give students extensive hands-on experience in using a new, potentially disruptive technology. This technology enables the masses access to supercomputing capabilities. We will introduce students to the CUDA programming language developed by NVIDIA Corp., which has been an essential step towards simplifying and unifying the programming of massively parallel chips. This IAP is supported by generous contributions from NVIDIA Corp., The Rowland Institute at Harvard, and MIT (OEIT, BCS, EECS) and will be featuring talks given by experts from various fields. 6.963 (IAP 09)
  103. 103. Co-Instructor: Hanspeter Pfister
  104. 104. Visual Computing • Large image & video collections • Physically-based modeling • Face modeling and recognition • Visualization
  105. 105. VolumePro 500 Released 1999
  106. 106. GPGPU
  107. 107. Connectome
  108. 108. NSF CDI Grant ’08-’11
  109. 109. NVIDIA CUDA Center of Excellence
  110. 110. TFs • Claudio Andreoni (MIT Course 18) • Dwight Bell (Harvard DCE) • Krunal Patel (AccelerEyes) • Jud Porter (Harvard SEAS) • Justin Riley (MIT OEIT) • Mike Roberts (Harvard SEAS)
  111. 111. Claudio Andreoni (MIT Course 18)
  112. 112. Dwight Bell (Harvard DCE)
  113. 113. Krunal Patel (AccelerEyes)
  114. 114. Jud Porter (Harvard SEAS)
  115. 115. Justin Riley (MIT OEIT)
  116. 116. Mike Roberts (Harvard SEAS)
  117. 117. About You
  118. 118. About you... • Undergraduate? Graduate? • Programming? >5 years? <2 years? • CUDA? MPI? MapReduce? • CS? Life Sc? Applied Sc? Engineering? Math? Physics? • Humanities? Social Sc? Economics?
  119. 119. Outline
  120. 120. CS 264 Goals • Have fun! • Learn basic principles of parallel computing • Learn programming with CUDA • Learn to program a cluster of GPUs (e.g. MPI) • Learn basics of EC2 and MapReduce • Learn new learning strategies, tools, etc. • Implement a final project
  121. 121. Experimental Learning Strategy: repeat, repeat, repeat... Memory “recall”
  122. 122. Lectures • Theory, Architecture, Patterns? • Act I: GPU Computing • Act II: Cloud Computing • Act III: Guest Lectures
  123. 123. Lectures “Format” • 2x ~45min regular “lectures” • ~15min “Clinic” • we’ll be here to fix your problems • ~5 min: Life and Code “Hacking”: • GTD Zen • Presentation Zen • Ninja Programming Tricks & Tools, etc. • Interested? email staff+spotlight@cs264.org
  124. 124. Act I: GPU Computing • Introduction to GPU Computing • CUDA Basics • CUDA Advanced • CUDA Ninja Tricks!
  125. 125. 3D Convolution Filterbank - Performance / Effort: Matlab: 0.3 gflops, 0.5 hours; C/SSE: 9.0 gflops, 10.0 hours; PS3: 111.4 gflops, 30.0 hours; GT200: 339.3 gflops, 10.0 hours
  126. 126. Empirical results... Performance (gflops): Q9450 (Matlab/C) [2008]: 0.3; Q9450 (C/SSE) [2008]: 9.0; 7900GTX (Cg) [2006]: 68.2; PS3/Cell (C/ASM) [2007]: 111.4; 8800GTX (CUDA 1.x) [2007]: 192.7; GTX280 (CUDA 2.x) [2008]: 339.3; GTX480 (CUDA 3.x) [2010]: 974.3. A >1000X speedup is game changing...
  127. 127. Act II: Cloud Computing • Introduction to utility computing • EC2 & StarCluster (Justin Riley, MIT OEIT) • Hadoop (Zak Stone, SEAS) • MapReduce with GPU jobs on EC2
  128. 128. Amazon’s Web Services • Elastic Compute Cloud (EC2) • Rent computing resources by the hour • Basic unit of accounting = instance-hour • Additional costs for bandwidth • You’ll be getting free AWS credits for course assignments
  129. 129. MapReduce • Functional programming meets distributed processing • Processing of lists with <key, value> pairs • Batch data processing infrastructure • Move the computation where the data is
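As a concrete picture of the <key, value> model above, here is a minimal word-count written in the MapReduce style in plain Python (illustrative only; in practice the grouping step is done by Hadoop or a similar framework across many machines).

```python
# Minimal word-count in the MapReduce style: map emits <key, value> pairs,
# the framework groups values by key (shuffle), and reduce aggregates each group.
from collections import defaultdict

def map_fn(doc):
    for word in doc.split():
        yield (word.lower(), 1)          # emit <word, 1>

def reduce_fn(word, counts):
    return (word, sum(counts))           # aggregate all values for one key

docs = ["the cloud is big", "the GPU is fast", "big data is big"]

# "Shuffle": group intermediate values by key (done by the framework in practice).
groups = defaultdict(list)
for doc in docs:
    for key, value in map_fn(doc):
        groups[key].append(value)

print(sorted(reduce_fn(k, v) for k, v in groups.items()))
```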
  130. 130. Act III: Guest Lectures • Andreas Klöckner (NYU): OpenCL & PyOpenCL • John Owens (UC Davis): fundamental algorithms/data structures and irregular parallelism • Nathan Bell (NVIDIA): Thrust • Duane Merrill* (Virginia Tech): Ninja Tricks • Mike Bauer* (Stanford): Sequoia • Greg Diamos (Georgia Tech): Ocelot • Other lecturers* from Google, Yahoo, Sun, Intel, NCSA, AMD, Cloudera, etc.
  131. 131. Labs • Led by TF(s) • Work on an interesting small problem • From skeleton code to solution • Hands-on
  132. 132. 53 Church St.
  133. 133. 53 Church St.
  134. 134. 53 Church St.
  135. 135. 53 Church St., Room 104 - Thu, Fri 7.35-9.35 pm
  136. 136. 53 Church St., Room 105
  137. 137. NVIDIA Fx4800 Quadro • MacPro • NVIDIA Fx4800 Quadro, 1.5 GB
  138. 138. Resonance @ SEAS • Quad-core Intel Xeon host, 3 GHz, 8 GB • 8 Tesla S1070s (32 GPUs, 4 GB each) • 16 quad-core Intel Xeons, 2 GHz, 16 GB • http://community.crimsongrid.harvard.edu/getting-started/resources/resonance-cuda-host
  139. 139. What do you need to know? • Programming (ideally in C / C++) • See HW 0 • Basics of computer systems • CS 61 or similar
  140. 140. Homeworks • Programming assignments • “Issue Spotter” (code debug & review, Q&A) • Contribution to the community (OSS, Wikipedia, Stack Overflow, etc.) • Due: Fridays at 11 pm EST • Hard deadline - 2 “bonus” days
  141. 141. Office Hours • Led by a TF • 104 @ 53 Church St (check website and news feed)
  142. 142. Participation • HW0 (this week) • Mandatory attendance for guest lectures • forum.cs264.org • Answer questions, help others • Post relevant links and discussions (!)
  143. 143. Final Project • Implement a substantial project • Pick from a list of suggested projects or design your own • Milestones along the way (idea, proposal, etc.) • In-class final presentations • $500+ prize for the best project
  144. 144. Grading • On a 0-100 scale • Participation: 10% • Homework: 50% • Final project: 40%
  145. 145. www.cs264.org • Detailed schedule (soon) • News blog w/ RSS feed • Video feeds • Forum (forum.cs264.org) • Academic honesty policy • HW0 (due Fri 2/4)
  146. 146. Thank you!
  147. 147. one more thing from WikiLeaks?
  148. 148. Is this course for me ???
  149. 149. This course is not for you... • If you’re not genuinely interested in the topic • If you can’t cope with uncertainty, unpredictability, poor documentation, and immature software • If you’re not ready to do a lot of programming • If you’re not open to thinking about computing in new ways • If you can’t put in the time. Slide after Jimmy Lin, iSchool, Maryland
  150. 150. Otherwise... It will be a richly rewarding experience!
  151. 151. Guaranteed?!
  152. 152. Be Patient, Be Flexible, Be Constructive. http://davidzinger.wordpress.com/2007/05/page/2/
  153. 153. It would be a win-win-win situation! (The Office Season 2, Episode 27: Conflict Resolution)
  154. 154. Hypergrowth ?
  155. 155. Acknowledgements • Hanspeter Pfister & Henry Leitner, DCE • TFs • Rob Parrott & IT Team, SEAS • Gabe Russell & Video Team, DCE • NVIDIA, esp. David Luebke • Amazon
  156. 156. CO ME
  157. 157. Next? • Fill out the survey: http://bit.ly/enrb1r • Get ready for HW0 (Lab 1 & 2) • Subscribe to http://forum.cs264.org • Subscribe to the RSS feed: http://bit.ly/eFIsqR