WT-4065, Superconductor: GPU Web Programming for Big Data Visualization, by Leo Meyerovich and Matthew Torok

1,137 views

Published on

Presentation WT-4065, Superconductor: GPU Web Programming for Big Data Visualization, by Leo Meyerovich and Matthew Torok at the AMD Developer Summit (APU13) Nov. 11-13, 2013.

Published in: Technology
0 Comments
1 Like
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total views
1,137
On SlideShare
0
From Embeds
0
Number of Embeds
2
Actions
Shares
0
Downloads
19
Comments
0
Likes
1
Embeds 0
No embeds

No notes for slide

WT-4065, Superconductor: GPU Web Programming for Big Data Visualization, by Leo Meyerovich and Matthew Torok

  1. 1. SUPERCONDUCTOR   A Browser Framework for Visualizing Big Data Leo  Meyerovich,  Ma.  Torok,  Ras  Bodik   @LMeyerov   UC  Berkeley  /  Graphistry   1  
  2. 2. Why Big Data Visualization? Yes No
  3. 3. Analysis  Result:    No   3  
  4. 4. Histogram of Voter Turnout by Town who’s ballot stuffing? # Towns most towns had a 40% voter turnout 0% 75% 25% 100% 50% Voter turnout 4  
  5. 5. Tree Map Demo
  6. 6. Ex: Time Series in IBM’s IT Monitor
  7. 7. GE Demo
  8. 8. Browser Engine ~= Chart Engine! DSLs render layout selectors parse Exploit Parallelism in Each One
  9. 9. Deploy Today via Parallel JavaScript data   styling   widgets   Parser.js   GPU   Compiler   Data  stays   on  GPU!   Selectors.CL   Layout.CL   JavaScript  VM   data  viz   Renderer.GL   webpage   Parser   Selectors   HTML  data   CSS  styling   JS  script   Layout   Renderer   Pixels   superconductor.js   9  
  10. 10. DSL 1: Data via JSON JavaScript, Ruby, Python, Java, … Easy… until 1-10s data loading 10  
  11. 11. Parsing Demo 11  
  12. 12. DSL 2: Designers Selectors <div>   <span>   <b>   <i>   <p>   <b>   <img  class=“dog”>   <b>   span    b    {  width:  83%  }                      div    .dog    {  float:  leJ  }                      p    ,    span  b    {  font-­‐size:  7px  }     12  
  13. 13. Problem: O(sels * tree log tree ) <div>   1K-100K HTML nodes <span>   <p>   × <b>   <i>   1-10K selectors <b>   <img  class=“dog”>   <b>   span    b    {  width:  83%  }                      div    .dog    {  float:  leJ  }                      p    ,    span  b    {  font-­‐size:  7px  }     13  
  14. 14. Good News: Embarrassing Parallelism! <div>   <span>   <b>   <i>   <p>   <b>   <img  class=“dog”>   <b>   span    b    {  width:  83%  }                      div    .dog    {  float:  leJ  }                      p    ,    span  b    {  font-­‐size:  7px  }     14  
  15. 15. Selector Engine Implementation selectors.css Dynamic Animation! edit style at runtime then recompile compiler.js selectors.webcl …
  16. 16. DSL 3: Layout CSS JS parallelizable layout flexible compute FTL parallelizable compute in declarative layout
  17. 17. Step  1/2:  Schema  of  VisualizaYon   Tree class hierarchy Node attributes 17  
  18. 18. [Kastens  1980,  Saraiva  2003]  [WWW  2010,   Step 2/2: Schema AttributePPOPP  2013]   Constraints 1.  Local     10px   5px   inputs   vars   2.  Single-­‐assignment   HBox   Leaf   y   w   h  x   Leaf   y   w  h   x   y   w   h  x   HBox   HBox ! left=IBox right=IBox w := left.w + right.w … Root   y   w  h   x   Leaf   y   w   h  x   Leaf   y   w   h   x   18  
  19. 19. llel ara P [WWW  2010]   Compiler Output: Layout as Tree Traversals logical  joins   w,h   w,h   w,h   Leaf   x,y   …   logical  spawns   w,h   w,h   w,h   Parallelism in each traversal!   Mozilla, Microsoft 1. Works for all data sets 2. Compiler automatically parallelizes! 19  
  20. 20. DSL 4: Rendering as a Layout Extension HBox ! left=IBox right=IBox @render @Rectangle(x,y,w,h,color) … w := left.w + right.w …
  21. 21. [Blelloch  93]   Traversals: Flattened & Level-Synchronous y x wh Array per attribute level  1   Nodes in arrays parallel for loop (level synchronous) level  n   Tree Compiler automates code + data transformations. 21  
  22. 22. Problem: Dynamic Memory Allocation on GPU? rect(…); … oval(…) square(…) function circ(x,y,r) { buffer = new line(…); … Array(r*10) for (i = 0; i < r * 10; i+ circ(…) +) buffer[i] = dynamic Math.cos(i) allocation" rect(…); …} 1.0 0.8 0.5 0.2 0 0.2 22  
  23. 23. Dynamic Allocation as SIMD Traversals allocRect(…)! 7 fillRect(…) allocLine(…)! 6 allocCirc(…)à 4 allocRect(…)! 6 1.0 0.8 0.5 0.2 0 0.2 1.0 0.8 0.5 0.2 1. Prefix sum for needed space 2. Allocate buffers fillLine(…) fillCirc(…) fillRect(…) 1.0 0.8 0.5 0.2 0 0.2 3. Fill vertex buffers in parallel 4. Give OpenGL buffers pointer 23  
  24. 24. CPU vs. GPU for Election Treemap: 5 traversals over 100K nodes Naïve JS (Chrome 26) 10,000 GPU (Safari + WebCL 11/3) COMBINED: 54X ! Time (ms) 1,000 100 24fps WebCL: 5X WebCL: 31X 10 1 layout (4 passes) rendering pass TOTAL 24  
  25. 25. DSLs for Big Data Visualization, Today.
  26. 26. Superconductor •  Explore data with interactive visualization •  Script charts like web pages: DSLs! •  Hardware accelerate each DSL •  We use WebCL: GPGPU, keeps data on GPU, dynamic compilation
  27. 27. Find us! sc-lang.com Leo: @LMeyerov / LMeyerov@gmail.com Matt: mtorok@berkeley.edu
  28. 28. Extra  
  29. 29. Parsing Demo 29  
  30. 30. Optimizing JSON Parsing raw.json: 23MB Parallel parsing easy! … when you fix the format partition raw1.json(1.9MB), …, raw12.json compress + zip csr1.zip (0.2MB), …, csr12.zip server   browser   Each worker: parallel parse parallel parse 1.  native JSON parse # csr.jso parallel parse 2.  decompress # obj.json 3. 0-copy return: typed arrays! big JavaScript object 30  

×