Elster falch-gpu-cse-sem-oct2013


  1. The Power of GPU Computing. Thomas L. Falch and Dr. Anne C. Elster(*), HPC-Lab, Dept. of Computer and Info. Science, Norwegian University of Science & Technology, Trondheim, Norway. http://research.idi.ntnu.no/hpc-lab (*) Elster also holds a 0% Visiting Scientist appointment at ICES, UT Austin, where she spends summers and sabbaticals. Falch & Elster: The Power of GPU Computing, NTNU CSE Seminar, Oct 2, 2013
  2. Thank yous to: HPC-Lab Post Docs and grad. students! HPC-Lab 2012
  3. Dr. Elster's HPC-Lab currently focuses on research related to novel GPU and multi-core architectures. > 40 Master students (since 2001); > 15 masters projects on GPU for HPC. Parallelization of Seismic and Image Related Applications on GPUs and Multi-Cores; Modeling Heterogeneous Systems; Parallel and Distributed Algorithms and Tools; Performance Evaluation and Benchmarking; Adaptive and Auto-Tuneable Algorithms and Implementations. Collaborators / Supporters: AMD, ARM, CERN, NVIDIA, Statoil, Schlumberger, GE-Healthcare, and others
  4. Outline • Introduction to GPU computing • Overview of GPU projects at the HPC-Lab – 3D Real-Time Snow simulation – Flow simulations (porous media) – Surface extraction • Visualization of scattered point data
  7. The “Walls” (ref. Dr. David Patterson). To increase processor performance one can: 1. Increase the system clock speed -> Power Wall(*) 2. Increase memory bandwidth -> more complex 3. Parallelize -> more complex. (*) The Power Wall: too much heat, and transistor performance degrades (more power leakage as power increases)! -> Now maxing out at 3-4 GHz for general processors
  9. The TOP10 (June 2013)
     Rank | Site | Manufacturer | Computer | Country | Cores | Rmax [Tflops] | Power [MW]
     1 | National University of Defense Technology | NUDT | Tianhe-2 (MilkyWay-2) - TH-IVB-FEP Cluster, Intel Xeon E5-2692 12C 2.200GHz, TH Express-2, Intel Xeon Phi 31S1P | China | 3,120,000 | 33,862.7 | 17.8
     2 | DOE/SC/Oak Ridge National Laboratory | Cray | Titan - Cray XK7, Opteron 6274 16C 2.200GHz, Cray Gemini interconnect, NVIDIA K20x | USA | 560,640 | 17,590.0 | 8.2
     3 | DOE/NNSA/LLNL | IBM | Sequoia - BlueGene/Q, Power BQC 16C 1.60 GHz, Custom | USA | 1,572,864 | 17,173.2 | 7.9
     4 | RIKEN Advanced Institute for Computational Science (AICS) | Fujitsu | K computer, SPARC64 VIIIfx 2.0GHz, Tofu interconnect | Japan | 705,024 | 10,510.0 | 12.7
     5 | DOE/SC/Argonne National Laboratory | IBM | Mira - BlueGene/Q, Power BQC 16C 1.60GHz, Custom | USA | 786,432 | 8,586.6 | 3.9
  10. Intel's Xeon Phi (aka MIC, Knights Ferry/Knights Corner, Larrabee)
  11. How to get to Exascale? Limited by Power! Solution? (BOF at SC'10 by Elster, Vazquez-Poletti & Perhac: Towards Exa-Scale: Heterogeneous Clouds (CPUs, GPUs and Embedded Devices))
  13. NTNU GPU Activities. NTNU is an NVIDIA CUDA Research & Teaching Center • Elster teaches a senior parallel computing class with 50+ students • Elster's HPC-Lab has graduated 20+ Master students in GPU computing (2007-2013)
  14. HPC-Lab History (last 6 yrs): Fall 2006: • First 2 student projects with GPU programming (Cg): Christian Larsen (MS Fall Project, December 2006): “Utilizing GPUs on Cluster Computers” (joint with Schlumberger); Erik Axel Nielsen asks for FX 4800 card for project with GE Healthcare • Elster head of Computational Science & Visualization program and helped NTNU acquire new IBM Supercomputer (Njord, 7+ TFLOPS, proprietary switch)
  15. HPC-Lab History (contin.): 2007: Erik Axel Nielsen (Masters thesis, June 2007): “Real-time Wavelet Filtering on the GPU” -- joint project with GE Healthcare. A 40 times GPU speedup of the algorithm led to our implementation being adopted the same fall in their high-end cardiovascular ultrasound scanner. Christian Larsen (Masters thesis, June 2007), Tore Fevang, Schlumberger (co-advisor): “Framework for Polygonal Structures Computations on Clusters” (incl. GPU parallelization). Idar Borlaug (Masters thesis, June 2007): “Seismic Processing Using Parallel 3D FMM”. Thibault Collet (Masters thesis, summer 2007): “Massively Online Games with Food Chains”. Knut Imar Hagen (Masters thesis, June 2007): “Fault-tolerance for MPI Codes on Computation Clusters” (joint project with Statoil). Nils Magnus Larsgård (Masters thesis, summer 2007): “Framework for Converting MPI Codes to Hybrid OpenMP/MPI Codes”
  16. HPC-Lab History (contin.): 2008: • Quad-core Supercomputer at UiTø (Stallo), ca. 70 TF • HPC-Lab at IDI/NTNU opens in Oct. with • several NVIDIA donations • several quad-core machines (1-2 donated by Schlumberger)
  17. HPC-Lab History (contin.): 2008: HPC-Lab at IDI/NTNU opens in Oct. with • several NVIDIA donations • several quad-core machines (1-2 donated by Schlumberger). Atle Rudshaug (Masters thesis, June 2008): “Optimizing & Parallelizing a Large Commercial Code for Modeling Oil-well Networks” -- joint project with Yggdrasil. Andreas Bach (Masters thesis, September 2008): “Profiling and Optimizing a Seismic Application on Modern Architectures” -- joint project with Statoil. Rune Hovland (Masters project, Dec 2008): “Latency and Bandwidth Impact on GPU Systems” (ParCo 2009 w/ Elster). Daniele Giuseppe Spampinato (Masters project, December 2008): “Linear Optimizations with CUDA” (IPDPS MTAAP 2009 w/ Elster)
  18. Selected Master theses and Master reports supervised by Dr. Elster in 2009: 1) Robin Eidissen (Masters thesis, January 2009): “Utilizing GPUs for Real-Time Visualization of Snow” (demoed @ SC'08-SC'10); Eirik Aksnes and Henrik Hesland (MS Project, Jan 2009): “GPU Techniques for Porous Rock Visualization” 2) Rune Erlend Jensen (Masters thesis, May 2009, currently PhD student at HPC-Lab): “Techniques and Tools for Optimizing Codes on Modern Architectures: A Low-Level Approach” (NR MS Thesis Award!) 3) Rune Johan Hovland (Masters thesis, June 2009), Dr. Magnus Lie Hetland (co-advisor): “Throughput Computing on Future GPUs” 4) Henrik Hesland (Masters thesis, June 2009), Thorvald Natvig (co-advisor): “GPU-Enabled Interactive Pore Detection for 3D Rock Visualization” 5) Eirik Ola Aksnes (Masters thesis, July 2009), Ståle Fjeldstand & Atle Rudshaug, Numerical Rocks (co-advisors): “Simulation of Fluid Flow Through Porous Rocks on Modern GPUs” (ParCo 2009) 6) Daniel Haugen (Masters thesis, July 2009), Tore Fevang, Schlumberger (co-advisor): “Seismic Data Compression and GPU Memory Latency” 7) Åsmund Herikstad (Masters thesis, July 2009), Svein-Erik Måsøy, MedTek, NTNU (co-advisor): “Parallel Techniques for Estimation and Correction of Aberration in Medical Ultrasound Imaging” 8) Owe Johansen (Masters thesis, July 2009), John Hybertsen & Jon André Haugen, Statoil (co-advisors): “Seismic Shot Processing on GPU” 9) Daniele Giuseppe Spampinato (Masters thesis, July 2009; currently PhD student @ ETH): “Modeling Communication on Multi-GPU Systems” (ParCo 2009)
  19. HPC-Lab -- Spring 2010. Dr. Anne C. Elster, Lab Director; Dr. John P. Ryan, Post Doc; Dr. Jan Perhac, Post Doc; Jan Christian Meyer (PhD stud.); Thorvald Natvig (PhD stud.); Rune E. Jensen (PhD stud.). Master Students – Spring 2010: Ahmed Aqrawi, Assist. TDT 4200; Aleksander Gjermundsen; Andreas Hysing; Øystein Krog; Holger Ludvigsen, Assist. TDT 4205. Affiliates / Visitors: + 2 Cybernetics students; + 3 visualization students; + 1-2 || arch/multicore students; + 1 Marine student; Eirik O. Aksnes (tentative PhD, now consultant for Statoil); Gagandeep Singh (Math). Refsnaes & Singh did FEM on GPU - collaborations between Kvamsdal & Elster
  20. HPC-Lab History (contin.): 2010: - NVIDIA Fermi-based cards (470, C2050, C2070 (fall)). More on OpenCL. Ahmed A. Aqrawi (Masters thesis, June 2010): “Effects of Compression on Data Intensive Algorithms”. Aleksander Gjermundsen (Masters thesis, July 2010): “Audio Processing on GPU”. Andreas Hysing (Masters thesis, Aug 2010): Parallel Inversion code (w/ Statoil). Øystein Krog (Masters thesis, June 2010): “GPU-based Real-Time Snow Avalanche Simulations” (SPH). Holger Ludvigsen (Masters thesis, June 2010), Dr. Frank Lindseth (co-advisor): “Real-Time GPU-Based 3D Ultrasound Reconstruction and Visualization”. Thorvald Natvig (PhD, Dec 2010): “Automatic Run-Time Communication and I/O”
  21. HPC-Lab -- 2011 (Elster on sabbatical 2010/11). Dr. Anne C. Elster, Lab Director; Dr. Ian Karlin, Post Doc; Jan Christian Meyer (PhD stud.); Rune E. Jensen (PhD stud.); Erik Smistad (PhD stud.), Elster co-advisor. Master Students – Spring 2011: Fredrik Fossum (GPU rigid body simulation); Yngve S. Lindal (GPU proj @ CERN); Ole-Martin Brende (MedTech); Ove Stinessen (Statoil proj); Jarle Stensland (OpenCL BLAS); Thor Kristian Valderhaug (Numerical Rocks proj, Multi-GPU LBM); Geir Jostein Lien (2-yr Master Informatics, graduated 2012). Affiliates / Visitors: Miguel Martinez-delAmor (PhD student from Spain, Fall 2011)
  22. HPC-Lab: Master theses 2012. Kjetil Babington: Terrain Rendering Techniques for the HPC-Lab Snow Simulator. Thomas Løfsgaard Falch: 3D Visualization of X-ray Diffraction Data. Geir Jostein Lien: Auto-tunable GPU BLAS. Jan Magne Rovde: Real-Time Granular Flow Simulation Using the PCISPH Method on GPGPU Devices Using CUDA. Frederik Magnus Johansen Vestre: Enhancing and Porting the HPC-Lab Snow Simulator to OpenCL on Mobile Platforms. Jan Christian Meyer, PhD thesis (December): Performance Modeling of Heterogeneous Systems
  23. HPC-Lab: 2012/2013. Jan Christian Meyer, PhD thesis (December): Performance Modeling of Heterogeneous Systems
  24. HPC-Lab: Master Theses 2013. Lars Kirkholt Melhus (June): Analyzing Contextual Bias of Program Execution on Modern CPUs. Magnus Mikalsen (June): OpenACC-based Snow Simulation. Andreas Nordahl (June): Enhancing the HPC-Lab Snow Simulator with More Realistic Terrains and Other Interactive Features. Lars Espen Nordhus (June): Ray Tracing for Simulation of Wireless Networks in 3D Scenes. Stian Aaraas Pedersen (June): Progressive Photon Mapping on GPUs. Andreas Skomedal (June): Heterogeneous FDTD for Seismic Processing. Henrik Holenbakken Knutsen (Sept): Enhancing Software Portability with Hardware Parametrized Autotuning
  25. HPC-Lab: 2013/2014. Anne C. Elster – Director. Malik Khan – Post Doc, to start Nov 1, 2013. PhD students: Rune Jensen, Johannes Kvam, Thomas Falch, Samira Pakdel, Ruben Spaans. Co-supervised by Elster: Johannes Kvam, Erik Smistad, Mehdi Bozorgi, Lane Holloway (UT ECE student). + 8 master students & 2 MedTech PhD students
  27. Snow Simulation: calc. 4+ million particles in real-time using multi-core CPU + GPU
  28. Snow Simulation – Wind field
  29. Add more real-time features
  30. Add Road Generation (used A* algorithm, demo @ SC11)
  31. Add Ray-Tracing
  33. Simulations of Fluid Flow through Porous Rocks using GPUs. Eirik Ola Aksnes & A.C. Elster (ParCo 2009) + current work with Thor Kristian Valderhaug using OpenCL. In collaboration with: Numerical Rocks & NTNU Chemistry Dept. Uses the Lattice Boltzmann Method (a.k.a. LBM)
  35. 3D Surface Extraction (w/ Dr. Frank Lindseth (SINTEF MedTek and NTNU), and MS/PhD student Erik Smistad)
  37. 3D Surface Extraction on GPUs • Use Marching Cubes – an algorithm for extracting a 3D surface from a set of sampled scalars • Algorithm used extensively for visualizing and analyzing medical data (X-ray, MR) and the results of 3D segmentation • Completely data parallel • Challenge: how to store the result of each cube in parallel on the GPU
  38. 3D Surface Extraction -- Histogram data • Challenge: how to store the result of each cube in parallel on the GPU? In a serial implementation this is simple – just use a stack and add the vertex data to the stack • GPU solution: Histogram Pyramids [1] • A data structure that: – filters out cubes that have no triangles (stream reduction) – returns the total sum of triangles – provides each cube with an index for memory storage – can be used efficiently by means of textures, yielding large speed-ups. [1] G. Ziegler et al.: On-the-fly Point Clouds through Histogram Pyramids; Vision, Modeling, and Visualization 2006
  39. 3D Surface Extraction -- Histogram Pyramids: Construction & Traversal (figures: HP Construction, HP Traversal)
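The construction and traversal steps named on slides 38-39 can be sketched on the CPU as follows. This is a minimal 1D illustration with a reduction factor of two, not the lab's implementation: the slides describe a GPU version that stores the levels in textures, and the function names `build_pyramid` and `traverse` are made up for this example.

```python
def build_pyramid(counts):
    """Build a 1D histogram pyramid from per-cube triangle counts.

    levels[0] is the (zero-padded) base level and levels[-1] holds a
    single entry: the total number of triangles.
    """
    # Pad the base to a power of two so every level halves cleanly.
    size = 1
    while size < len(counts):
        size *= 2
    levels = [list(counts) + [0] * (size - len(counts))]
    # Each new level stores the sum of adjacent pairs from the level below.
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([prev[i] + prev[i + 1] for i in range(0, len(prev), 2)])
    return levels


def traverse(levels, k):
    """Map output index k to (cube index, triangle index within that cube).

    Walks from the top of the pyramid down to the base, descending into
    the left or right child depending on where k falls.
    """
    idx = 0
    for level in reversed(levels[:-1]):
        idx *= 2                  # left child in the level below
        left = level[idx]
        if k >= left:             # k lies in the right child's range
            k -= left
            idx += 1
    return idx, k


counts = [0, 2, 0, 1, 3, 0, 0, 1]   # hypothetical triangles per cube
levels = build_pyramid(counts)
total = levels[-1][0]               # size of the output buffer
placement = [traverse(levels, k) for k in range(total)]
```

Every output slot can be resolved independently of the others, which is why the traversal maps well to one GPU thread per output element; empty cubes never appear in `placement`, which is the stream reduction mentioned on the slide.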
  40. 3D Surface Extraction -- Results: HPMC (Dyken et al.) vs. our OpenCL implementation
      Size | HPMC exec. time | HPMC FPS (avg) | HPMC memory | Ours exec. time | Ours FPS (avg) | Ours memory
      512^3 | 3324 ms | 0.3 | 490 MB | 34 ms | 0.3 | 121 MB
      256^3 | 5 ms | 223 | 122 MB | 10 ms | 105 | 40 MB
      128^3 | 3 ms | 394 | 44 MB | 4 ms | 233 | 26 MB
      64^3 | 2 ms | 519 | 22 MB | 3 ms | 319 | 22 MB
      Our test system: Intel i5 750, 4 GB RAM; ATI Radeon 5870 (1 GB RAM); AMD Catalyst 11.2 graphics driver; APP SDK 2.3 w/ OpenCL 1.1
      Note: OpenCL-OpenGL synch measured to be 2-20 ms, i.e. 70-90% for the smallest datasets
  42. Scattered Point Data (figure panels (a), (b), (c))
  43. Examples • Sensor networks • Simulations (n-body, SPH) • Post-processing/streaming over network
  44. X-ray Diffraction (diagram labels: X-ray source, Specimen, Detector, Q, k_i, k_f)
  45. Volume Ray Casting (diagram labels: Eye/camera, Image, Ray, Volume)
  46. Volume Ray Casting of Scattered Point Data (diagram labels: Eye/camera, Image, Ray, Bounding box)
  47. Interpolation (diagram labels: r1, r2, r3)
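The slide only shows the distances r1, r2, r3 from a sample position to nearby data points and does not state the weighting scheme. As one plausible illustration, the sketch below uses inverse distance weighting; that choice, the function name `interpolate_idw` and the `power` parameter are assumptions for this example, not the method from the talk.

```python
import math


def interpolate_idw(query, points, values, power=2.0):
    """Estimate a value at `query` from scattered neighbors.

    Each neighbor is weighted by 1 / r**power, where r is its distance
    to the query position (assumed scheme, see the note above).
    Expects at least one neighbor.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for point, value in zip(points, values):
        r = math.dist(query, point)
        if r == 0.0:
            return value          # query coincides with a data point
        weight = 1.0 / r ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Two neighbors at equal distance contribute equally.
midpoint_value = interpolate_idw(
    (0.0, 0.0, 0.0),
    [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)],
    [2.0, 4.0],
)
```

In a ray caster this function would be evaluated at every sample position along every ray, so the cost of finding the neighbors dominates; that is the topic of the next slide.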
  48. Finding Neighbors
  49. Optimizations • Empty space skipping • Early ray termination • Filtering
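Early ray termination can be illustrated with standard front-to-back compositing: once the accumulated opacity is close to one, samples further along the ray can no longer change the pixel noticeably, so the loop stops. This is a generic sketch with scalar colors; the function name `composite_ray` and the cutoff value are assumptions, and empty space skipping and filtering are not modeled here.

```python
def composite_ray(samples, opacity_cutoff=0.99):
    """Front-to-back compositing of (color, alpha) samples along one ray.

    Returns (color, alpha, samples_used) so the effect of early
    termination is visible to the caller.
    """
    color = 0.0
    alpha = 0.0
    used = 0
    for sample_color, sample_alpha in samples:
        used += 1
        # Remaining transparency scales the contribution of this sample.
        weight = (1.0 - alpha) * sample_alpha
        color += weight * sample_color
        alpha += weight
        if alpha >= opacity_cutoff:
            break                 # early ray termination
    return color, alpha, used


# The second sample is fully opaque, so the third is never evaluated.
result = composite_ray([(1.0, 0.0), (0.5, 1.0), (0.9, 0.5)])
```

Because rays terminate at different depths, the amount of work per pixel varies across the image.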
  50. GPU Implementation • CUDA C, almost same code • One thread for each ray/pixel • Remove recursion (for older hardware) • Texture memory
  51. Multi GPU • Load distribution challenging – Different hardware – Different amount of work per thread (ray/pixel) • Use previous image to divide work for next • Ray length as proxy for amount of work
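One way to read the last two bullets is sketched below: sum the ray lengths of the previous frame per image row, then cut the rows into contiguous bands whose accumulated cost matches each GPU's share. The row-wise split, the greedy cut and the `gpu_speeds` weights are assumptions made for this example; the slides do not say how the image was partitioned.

```python
def split_rows(row_costs, gpu_speeds):
    """Divide image rows into one contiguous band per GPU.

    row_costs:  per-row work estimate from the previous frame
                (for example, summed ray lengths).
    gpu_speeds: relative throughput of each GPU (hypothetical weights).
    Returns half-open (start, end) row ranges, one per GPU.
    """
    total = sum(row_costs)
    speed_sum = sum(gpu_speeds)
    bands = []
    start = 0
    row = 0
    accumulated = 0.0
    target = 0.0
    for gpu, speed in enumerate(gpu_speeds):
        # Cumulative cost this GPU and all earlier ones should cover.
        target += total * speed / speed_sum
        if gpu == len(gpu_speeds) - 1:
            row = len(row_costs)  # last GPU takes whatever is left
        else:
            while row < len(row_costs) and accumulated + row_costs[row] <= target:
                accumulated += row_costs[row]
                row += 1
        bands.append((start, row))
        start = row
    return bands


# Expensive rows at the bottom of the image: the second GPU gets fewer rows.
bands = split_rows([2, 2, 2, 2, 8], [1, 1])
```

The split is recomputed every frame, so it adapts as the camera moves; it is only as good as the assumption that consecutive frames look similar.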
  52. Results
  59. Questions?
  60. Thank yous to: HPC-Lab Post Docs and grad. students! (photos: @ SC'07, Spring 2007, Spring 2010, Spring 2009)
  61. Modeling Heterogeneous Systems: “Optimized Barriers for Heterogeneous Systems Using MPI”, Jan Christian Meyer, PhD student finishing summer 2011 (to be presented at IEEE IPDPS 2011, HCW)
  62. Dealing with bandwidth issues: Compression of Large Seismic Datasets on GPU (Aqrawi & Elster, IPDPS 2011)
  63. Motivation. Locality & I/O – a challenge for data intensive algorithms. Look at techniques for reducing mem. bandwidth: – Hardware: HDD, SSD – Compression: JPEG, MPEG, MP3 ... – Explore GPU compression capabilities. Seismic filtering process: – Transform coding works well for signal data* * [H.S. Malvar 1992], [L.C. Duval 2000], [C. Larsen 2006], [D. Haugen 2009]
  64. Results: GPU acceleration. Execution time comparison to Fermi architecture (bar chart; y-axis: execution time (s), 0-700; x-axis: algorithm: DCT 3D, DCT AAN 3D, LOT 1D; series: Intel i7 Single, Intel i7 Quad, Nvidia Tesla C1060, Nvidia Tesla C2050)
  65. Results: I/O speedup (bar chart; y-axis: I/O speedup compared to platform, 0-7; x-axis: compression algorithm; series: I/O speedup HDD, I/O speedup SSD). Algorithms: RLE (Single), RLE (Quad), Huffman (Single), Huffman (Quad), Huffman (GPU), 1D DCT (Single), 2D DCT (Single), 1D AAN (Single), 2D AAN (Single), 3D AAN (Single), 1D DCT (Quad), 2D DCT (Quad), 1D AAN (Quad), 2D AAN (Quad), 3D AAN (Quad), 1D DCT (GPU), 2D DCT (GPU), 1D AAN (GPU), 2D AAN (GPU), 3D AAN (GPU)
  66. Summary: Compression – When optimizing for I/O, one needs an efficient compression rate AND a fast compression algorithm – Compression can give up to: 6.2x I/O speedup on HDD (70 MB/s); 3.9x I/O speedup on SSD (140 MB/s) – Achieved through: transform coding; CPU & GPU co-op; asynch I/O – Predictive model accurate within 5% – Seismic compression library
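The first bullet, and the fact that the HDD benefits more than the faster SSD, can be illustrated with a simple throughput model. This is a back-of-the-envelope sketch, not the predictive model from the paper: it assumes disk reads and decompression overlap perfectly (asynchronous I/O), and the compression ratio and codec throughput used below are made-up numbers. Only the 70 MB/s and 140 MB/s disk rates come from the slide.

```python
def io_speedup(disk_mb_s, compression_ratio, codec_mb_s):
    """Speedup of reading compressed data versus reading raw data.

    disk_mb_s:         raw disk throughput in MB/s.
    compression_ratio: uncompressed size divided by compressed size.
    codec_mb_s:        decompression throughput in MB/s of output data.
    With full overlap, the slower of the two pipeline stages sets the
    effective data rate.
    """
    effective_mb_s = min(disk_mb_s * compression_ratio, codec_mb_s)
    return effective_mb_s / disk_mb_s


# Hypothetical codec: 8:1 ratio, 280 MB/s decompression throughput.
hdd_speedup = io_speedup(70, 8, 280)    # disk rate from the slide
ssd_speedup = io_speedup(140, 8, 280)   # disk rate from the slide
```

With these made-up codec numbers both cases are limited by the codec rather than the disk, so the faster disk sees the smaller relative gain, which is the same direction as the reported 6.2x versus 3.9x.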
