Interactive Latency in Big Data Visualization

Zhicheng "Leo" Liu, Research Scientist at the Creative Technologies Lab at Adobe Research
January 22nd, 2014

Reducing interactive latency is a central problem in visualizing large datasets. I discuss two inter-related projects in this problem space. First, I present the imMens system and show how we can achieve real-time interaction at 50 frames per second for billions of data points by combining techniques such as data tiling and parallel processing. Second, I discuss an ongoing user study that aims to understand the effect of interactive latency on human cognitive behavior in exploratory visual analysis.
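As a rough illustration of the binned aggregation and data tiling idea mentioned above, here is a minimal Python sketch. It is not the imMens implementation (which, per the slides, runs queries in WebGL fragment shaders on the GPU); the function names, the sparse `Counter` layout, and the toy records are all assumptions made for illustration. The point it shows is that a brushing query touches only the cells of a pre-aggregated tile, so its cost depends on the bin resolution rather than on the number of raw records.

```python
from collections import Counter


def bin_index(value, lo, hi, bins):
    """Map a value in [lo, hi] to one of `bins` equal-width bins."""
    if not lo <= value <= hi:
        raise ValueError(f"{value} is outside [{lo}, {hi}]")
    i = int((value - lo) / (hi - lo) * bins)
    return min(i, bins - 1)  # the upper bound falls into the last bin


def build_tile(records, x_range, y_range, xy_bins):
    """Aggregate (x, y, month) records into a sparse 3-D tile of counts."""
    tile = Counter()
    for x, y, month in records:
        key = (bin_index(x, *x_range, xy_bins),
               bin_index(y, *y_range, xy_bins),
               month)
        tile[key] += 1
    return tile


def brush_months(tile, selected_months, xy_bins):
    """Sum the tile over the brushed months to get a 2-D heatmap of counts."""
    heat = [[0] * xy_bins for _ in range(xy_bins)]
    for (xb, yb, month), count in tile.items():
        if month in selected_months:
            heat[yb][xb] += count
    return heat


# Hypothetical toy data: (x, y, month index) with x and y in [0, 1].
records = [(0.1, 0.1, 0), (0.15, 0.12, 0), (0.9, 0.9, 5), (0.1, 0.9, 0)]
tile = build_tile(records, (0.0, 1.0), (0.0, 1.0), xy_bins=4)
january = brush_months(tile, {0}, xy_bins=4)
```

Here `january` is the heatmap that a linked geographic plot would display while month 0 is brushed in a month histogram. The tile is built once; each subsequent brush only iterates over tile cells.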

Big Data Visualization Meetup - South Bay
http://www.meetup.com/Big-Data-Visualisation-South-Bay/


  1. 1. Interactive Latency in Big Data Visualization Zhicheng “Leo” Liu Jan 22, 2014
  2. 2. Latency: a measure of time delay experienced in a system. Examples: rotational latency, network latency, query latency, interactive latency.
  3. 3. Questions: How to reduce interactive latency in big data visualization? How does interactive latency affect user behavior?
  4. 4. Questions: How to reduce interactive latency in big data visualization? How does interactive latency affect user behavior?
  5. 5. Reducing Latency. More memory: in-memory data store. Clever indexing: cube representation schemes. Parallel processing: multicore, GPGPU, distributed platforms.
  6. 6. imMens: a holistic approach. Perceptual scalability: binned aggregation as primary data reduction strategy. Interactive scalability: multivariate data tiles; parallel query processing and rendering on GPU. [Liu et al. 2013]
  7. 7. imMens: a holistic approach. Perceptual scalability: binned aggregation as primary data reduction strategy. Interactive scalability: multivariate data tiles; parallel query processing and rendering on GPU. [Liu et al. 2013]
  8. 8. Guiding Principle Perceptual & interactive scalability should be limited by the chosen resolution of the visualized data, not the number of records.
  9. 9. Data
  10. 10. Data; Alpha-blending
  11. 11. Data
  12. 12. Data; Sampling
  13. 13. Data; Sampling; Modeling
  14. 14. Data; Sampling; Modeling; Binned Aggregation
  15. 15. Google Fusion Tables: Sampling
  16. 16. Aggregation
  17. 17. Binned Plots: Design Space. Figure: 1D and 2D binned plots for numeric, ordinal/categorical, temporal, and geographic data.
  18. 18. imMens: a holistic approach. Perceptual scalability: binned aggregation as primary data reduction strategy. Interactive scalability: multivariate data tiles; parallel query processing and rendering on GPU. [Liu et al. 2013]
  19. 19. Demo
  20. 20. Multivariate Data Tiles: Projections / Materialized database views. Provide data for dynamic visualization. Much faster than a traditional data cube.
  21. 21.
  22. 22. Brush & Link: A Naïve Approach. Full cube over X, Y, Month, Day, Hour: 12 x 31 x 24 x 512 x 512 = ~2.3 billion cells
  23. 23. Brushing Over January (cube over X, Y, Day, Hour): 31 x 24 x 512 x 512 = ~195 million cells
  24. 24. Sum Along Day (days 0 to 30 collapsed): 24 x 512 x 512 = ~6 million cells
  25. 25. Sum Along Hour (days 0 to 30 and hours 0 to 23 collapsed): 512 x 512 cells
  26. 26. Decomposing a Data Cube. For any pair of 1D or 2D binned plots, the maximum number of dimensions needed to support brushing & linking is 4. Figure: the full 5-D cube (X, Y, Month, Day, Hour) is summed (Σ) into 3-D cubes: Month-Day-Hour, X-Y-Hour, X-Y-Day, X-Y-Month.
  27. 27.
  28. 28. Tiles. X: 256-511, X: 512-767; Y: 512-767, Y: 768-1023; Day: 31 bins
  29. 29. From Datacube to Data Tiles. Figure: a cube labeled "Y: 512 - 1023, day: 0 - 31" is split into four tiles, each with day bins 0 to 30: (Y 512-767, X 256-511), (Y 512-767, X 512-767), (Y 768-1023, X 256-511), (Y 768-1023, X 512-767).
  30. 30. Data Tiles
  31. 31. x1-y1-month
  32. 32. x1-y1-day
  33. 33. x1-y1-hour
  34. 34. x1-y2-month
  35. 35. x1-y2-day
  36. 36. x1-y2-hour
  37. 37. x2-y1-month
  38. 38. x2-y1-day
  39. 39. x2-y1-hour
  40. 40. x2-y2-month
  41. 41. x2-y2-day
  42. 42. x2-y2-hour
  43. 43. month-day-hour
  44. 44.
  45. 45. imMens Architecture. Figure labels: Server (SciDB, Postgres); Client (UI control, Visualization); interactions: specify, brush & link, zoom & pan.
  46. 46. Client-Side Processing. Pack data tiles as images (352KB for Brightkite). Bind to WebGL context as textures. Pass 1: a query fragment shader reads the data tiles (figure example: X [512-767], Y [768-1023], bins 0 to 11, packed into RGBA channels) and writes projections to an off-screen FBO. Pass 2: a render fragment shader draws to the canvas.
  47. 47. Performance Benchmarks. Simulate brush & linking across plots in a scatter plot matrix. imMens vs. full data cube. 60 synthesized datasets. Parameters: bin count per dimension (10, 20, 30, 40, 50); number of records (10K, 100K, 1M, 10M, 100M, 1B); number of dimensions (4, 5).
  48. 48. Test setup: Google Chrome v.23.0.1271.95 on a quad-core 2.3 GHz MacBook Pro (OS X 10.8.2) with per-core 256K L2 caches, shared 6MB L3 cache and 8GB RAM. PCI Express NVIDIA GeForce GT 650M graphics card with 1024MB video RAM. Chart data labels: 51.9, 52.3, 51.6, 52.0, 53.2, 52.1, 5.5, 3.0, 2.2
  49. 49. Benchmark chart repeated, with the same test setup and data labels.
  50. 50. Benchmark chart repeated, with the same test setup and data labels. 50fps querying and rendering of 1B data points
  51. 51. Speed of Thought?
  52. 52. Questions: How to reduce interactive latency in big data visualization? How does interactive latency affect user behavior?
  53. 53. Newell (1994): Unified Theories of Cognition
  54. 54. Columns: Newell (1994) | Card et al (1983) | Example | Time Range
      deliberate act | perceptual fusion | recognize a pattern, track animation | ~100 milliseconds
      cognitive operation | unprepared response | click a link, select an object | ~1 second
      unit task | unit task | edit a line of text, make a chess move | ~10 seconds
  55. 55. ~300ms: The Embodiment Level
  56. 56. Deictic Strategy Pointing movements bind objects in the world
  57. 57. Small changes in cost of binding cause different cognitive behavior
  58. 58. Latency affects high-level/longitudinal strategies. Block-copying: Ballard et al (1995, 1997). 8-puzzle solving: O’Hara and Payne (1998, 1999). Search: Brutlag (2009).
  59. 59. Exploratory Visual Analysis?
  60. 60. Latency Conditions. Columns: Operation | Low | High
      brush & link | ~20ms | ~20ms + 500ms
      select | ~20ms | ~20ms + 500ms
      pan | ~100ms | ~100ms + 500ms
      zoom | ~1000ms | ~1000ms + 500ms
  61. 61. Datasets
  62. 62. Study Design. 16 participants, 32 observations. 2 x 2 between subject. Data collected: interaction logs, audio transcripts.
  63. 63. Log Events. System and Mouse Events: brush, select, zoom, pan, clear, color slider, log scale, tiles cached, mouse down, mouse up, mouse move. Trigger vs. Processed System Events: debouncing keeps system usable. Logged fields: timestamp, event type, parameters.
  64. 64. Normalized Processed Events
  65. 65. How to Evaluate Performance? The purpose of visualization is insight, not pictures.
  66. 66. Counting Insights
  67. 67. What is an insight? “many new airlines emerged around year 2003”; “HP started in 2001, AS in 2003, PI in 2004, OH in 2003”; “OH started in 2003, and they are doing pretty well in terms of delays”
  68. 68. Questions: How to reduce interactive latency in big data visualization? imMens: a system supporting real-time interaction; binned aggregation for perceptual scalability; multivariate data tiles & GPU processing for low latency. How does interactive latency affect user behavior? Comparative study: quantitative & qualitative analysis.
  69. 69. Questions: How to reduce interactive latency in big data visualization? imMens: a system supporting real-time interaction; binned aggregation for perceptual scalability; multivariate data tiles & GPU processing for low latency. How does interactive latency affect user behavior?
  70. 70. Questions: How to reduce interactive latency in big data visualization? imMens: a system supporting real-time interaction; binned aggregation for perceptual scalability; multivariate data tiles & GPU processing for low latency. How does interactive latency affect user behavior? User study: quantitative & qualitative analysis.
  71. 71. Acknowledgment Jeffrey Heer Biye Jiang
  72. 72. Thank You
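
The cell counts quoted on slides 22 to 26 can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the bin counts come from the slides, while the names `BINS` and `cells`, and the choice of four full-resolution 3-D cubes (ignoring the further split into spatial tiles), are assumptions made for illustration.

```python
from math import prod

# Bin counts per dimension, as given on slide 22.
BINS = {"x": 512, "y": 512, "month": 12, "day": 31, "hour": 24}


def cells(dims):
    """Number of cells in a dense cube over the given dimensions."""
    return prod(BINS[d] for d in dims)


# Naive approach: one full 5-D cube.
full_cube = cells(["x", "y", "month", "day", "hour"])

# Brushing one month fixes the month dimension, then day and hour are summed out.
brushed_january = cells(["x", "y", "day", "hour"])
after_sum_day = cells(["x", "y", "hour"])
after_sum_hour = cells(["x", "y"])

# Decomposition into the 3-D cubes shown on slide 26.
decomposed = [
    ("x", "y", "month"),
    ("x", "y", "day"),
    ("x", "y", "hour"),
    ("month", "day", "hour"),
]
decomposed_total = sum(cells(d) for d in decomposed)

print(f"full 5-D cube:     {full_cube:,} cells")
print(f"brushing January:  {brushed_january:,} cells")
print(f"sum along day:     {after_sum_day:,} cells")
print(f"sum along hour:    {after_sum_hour:,} cells")
print(f"four 3-D cubes:    {decomposed_total:,} cells")
```

Under these assumptions the four 3-D cubes together hold roughly 17.6 million cells, compared with about 2.3 billion for the full cube, which is the motivation for decomposing before tiling.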
