Interactive Latency in Big Data Visualization
Zhicheng "Leo" Liu, Research Scientist at the Creative Technologies Lab at Adobe Research
January 22nd, 2014
Reducing interactive latency is a central problem in visualizing large datasets. I discuss two inter-related projects in this problem space. First, I present the imMens system and show how we can achieve real-time interaction at 50 frames per second for billions of data points by combining techniques such as data tiling and parallel processing. Second, I discuss an ongoing user study that aims to understand the effect of interactive latency on human cognitive behavior in exploratory visual analysis.
Big Data Visualization Meetup - South Bay
http://www.meetup.com/Big-Data-Visualisation-South-Bay/
2. Latency:
a measure of time delay experienced in a system
rotational latency
network latency
query latency
interactive latency
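Of these, interactive latency is the one a visualization user feels directly: the delay between an input event (for example, a brush gesture) and the updated pixels on screen. A minimal sketch of measuring it in the browser, assuming a hypothetical updateLinkedViews() callback that re-queries and redraws:

    // Rough sketch: time from a user gesture to (approximately) the next painted frame.
    // `updateLinkedViews` is a hypothetical callback that re-queries and redraws the charts.
    function onBrush(updateLinkedViews: () => void): void {
      const start = performance.now();   // timestamp of the user gesture
      updateLinkedViews();               // query + re-render the linked views
      requestAnimationFrame(() => {
        // Runs just before the browser paints the next frame.
        const latencyMs = performance.now() - start;
        console.log(`interactive latency: ${latencyMs.toFixed(1)} ms`);
      });
    }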
3.
4. Questions
How to reduce interactive latency in big data visualization?
How does interactive latency affect user behavior?
5. Questions
How to reduce interactive latency in big data visualization?
How does interactive latency affect user behavior?
6. Reducing Latency
More memory
in-memory data store
Clever indexing
cube representation schemes
Parallel processing
multicore, GPGPU, distributed platforms
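To make the "clever indexing" idea concrete: a data cube precomputes aggregates for combinations of dimension values, so a query becomes a lookup rather than a scan over raw records. A minimal in-memory sketch (illustrative field names, not imMens itself):

    // Minimal in-memory "cube": precomputed counts keyed by (carrier, year).
    // Query cost depends on the number of cells, not the number of raw records.
    type Row = { carrier: string; year: number };

    function buildCube(rows: Row[]): Map<string, number> {
      const cube = new Map<string, number>();
      for (const r of rows) {
        const key = `${r.carrier}|${r.year}`;
        cube.set(key, (cube.get(key) ?? 0) + 1);
      }
      return cube;
    }

    // Usage: answer "how many flights did OH fly in 2003?" with a single lookup.
    const cube = buildCube([{ carrier: "OH", year: 2003 }, { carrier: "AS", year: 2003 }]);
    console.log(cube.get("OH|2003") ?? 0); // 1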
7. imMens: a holistic approach
Perceptual scalability
Binned aggregation as primary data reduction strategy
Interactive scalability
Multivariate data tiles
Parallel query processing and rendering on GPU
[Liu et al. 2013]
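Binned aggregation is the data reduction step: raw values are collapsed into counts per bin, so the amount of data to transfer and draw depends on the chosen bin resolution, not on the number of records. A rough 1D sketch (imMens precomputes multidimensional bins and slices them into data tiles):

    // Reduce raw values to per-bin counts; `counts.length` depends only on binCount.
    function binnedCounts(values: number[], min: number, max: number, binCount: number): number[] {
      const counts = new Array<number>(binCount).fill(0);
      const width = (max - min) / binCount;
      for (const v of values) {
        let b = Math.floor((v - min) / width);
        if (b < 0) b = 0;
        if (b >= binCount) b = binCount - 1; // clamp v === max into the last bin
        counts[b] += 1;
      }
      return counts;
    }

    // The output length is 50 regardless of how many raw values go in.
    console.log(binnedCounts([1.2, 3.7, 3.9, 9.99], 0, 10, 50).length); // 50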
8. imMens: a holistic approach
Perceptual scalability
Binned aggregation as primary data reduction strategy
Interactive scalability
Multivariate data tiles
Parallel query processing and rendering on GPU
[Liu et al. 2013]
9. Guiding Principle
Perceptual & interactive scalability should be limited
by the chosen resolution of the visualized data,
not the number of records.
19. imMens: a holistic approach
Perceptual scalability
Binned aggregation as primary data reduction strategy
Interactive scalability
Multivariate data tiles
Parallel query processing and rendering on GPU
[Liu et al. 2013]
47. Client-Side Processing
Diagram: multivariate data tiles are packed as images (352KB for Brightkite) and bound to the WebGL context as textures, with bin values encoded in the RGBA channels. Pass 1: a fragment shader evaluates the query over the binned ranges (e.g., X [512-767], Y [768-1023]) and renders projections to an off-screen FBO. Pass 2: a second fragment shader pass renders the aggregated result to the canvas.
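A minimal sketch of the general technique (not the exact imMens encoding): per-bin counts are packed into the bytes of an RGBA image, uploaded once as a WebGL texture, and decoded in a fragment shader so that query aggregation and rendering stay on the GPU. It assumes a browser WebGL 1.0 context and a vertex shader that supplies the vUv coordinate.

    // Pack per-bin counts into RGBA bytes: 4 bytes per bin = one 32-bit count per texel.
    function packCounts(counts: Uint32Array, width: number, height: number): Uint8Array {
      const bytes = new Uint8Array(width * height * 4);
      counts.forEach((c, i) => {
        bytes[4 * i]     =  c          & 0xff; // R: low byte
        bytes[4 * i + 1] = (c >>> 8)   & 0xff; // G
        bytes[4 * i + 2] = (c >>> 16)  & 0xff; // B
        bytes[4 * i + 3] = (c >>> 24)  & 0xff; // A: high byte
      });
      return bytes;
    }

    // Upload a packed data tile as a texture the fragment shader can sample.
    function uploadTile(gl: WebGLRenderingContext, bytes: Uint8Array, width: number, height: number): WebGLTexture {
      const tex = gl.createTexture()!;
      gl.bindTexture(gl.TEXTURE_2D, tex);
      gl.texImage2D(gl.TEXTURE_2D, 0, gl.RGBA, width, height, 0, gl.RGBA, gl.UNSIGNED_BYTE, bytes);
      gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MIN_FILTER, gl.NEAREST); // exact texel reads, no blending
      gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MAG_FILTER, gl.NEAREST);
      gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_S, gl.CLAMP_TO_EDGE);
      gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_T, gl.CLAMP_TO_EDGE);
      return tex;
    }

    // Fragment shader (GLSL ES 1.0) that recovers the count stored in the sampled texel.
    // Note: highp float represents integers exactly only up to 2^24.
    const decodeFragmentShader = `
      precision highp float;
      uniform sampler2D uTile;
      varying vec2 vUv;
      void main() {
        vec4 t = texture2D(uTile, vUv) * 255.0;
        float count = t.r + t.g * 256.0 + t.b * 65536.0 + t.a * 16777216.0;
        gl_FragColor = vec4(vec3(min(count / 1024.0, 1.0)), 1.0); // arbitrary gray-scale mapping
      }
    `;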
48. Performance Benchmarks
Simulate brush & linking across plots in a scatter plot matrix
imMens vs. full data cube
60 synthesized datasets
Parameters: bin count per dimension (10, 20, 30, 40, 50); number of records (10K, 100K, 1M, 10M, 100M, 1B); number of dimensions (4, 5)
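The 60 synthesized datasets are presumably the cross product of the three parameters (5 bin counts x 6 record counts x 2 dimension counts). A small sketch enumerating that grid:

    // Benchmark grid: 5 bin counts x 6 record counts x 2 dimension counts = 60 configurations.
    const binCounts = [10, 20, 30, 40, 50];
    const recordCounts = [1e4, 1e5, 1e6, 1e7, 1e8, 1e9];
    const dimensionCounts = [4, 5];

    const configs = binCounts.flatMap(bins =>
      recordCounts.flatMap(records =>
        dimensionCounts.map(dims => ({ bins, records, dims }))));

    console.log(configs.length); // 60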
49-51. Benchmark results
Test setup: Google Chrome v.23.0.1271.95 on a quad-core 2.3 GHz MacBook Pro (OS X 10.8.2) with per-core 256K L2 caches, shared 6MB L3 cache and 8GB RAM; PCI Express NVIDIA GeForce GT 650M graphics card with 1024MB video RAM.
Chart (frames per second): 51.9, 52.3, 51.6, 52.0, 53.2, 52.1 for imMens; 5.5, 3.0, 2.2 for the full data cube comparison.
50fps querying and rendering of 1B data points
56.
Newell (1994)       | Card et al. (1983)  | Example                                | Time Range
deliberate act      | perceptual fusion   | recognize a pattern, track animation   | ~100 milliseconds
cognitive operation | unprepared response | click a link, select an object         | ~1 second
unit task           | unit task           | edit a line of text, make a chess move | ~10 seconds
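These time scales are commonly read as latency budgets for interaction: staying under ~100 ms keeps brushing and animation perceptually fused, ~1 s suits discrete actions, and ~10 s bounds a unit task. A trivial helper mapping a measured latency onto the bands in the table above (a sketch, not part of the talk):

    // Classify a measured latency against the time scales in the table above.
    function latencyBand(ms: number): string {
      if (ms <= 100) return "perceptual fusion (~100 ms): smooth brushing and animation";
      if (ms <= 1000) return "unprepared response (~1 s): fine for discrete actions";
      if (ms <= 10000) return "unit task (~10 s): a noticeable pause in the analysis flow";
      return "beyond a unit task: likely to disrupt exploratory analysis";
    }

    console.log(latencyBand(45));   // perceptual fusion (~100 ms): ...
    console.log(latencyBand(2500)); // unit task (~10 s): ...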
69. What is an insight?
"many new airlines emerged around year 2003"
"HP started in 2001, AS in 2003, PI in 2004, OH in 2003"
"OH started in 2003, and they are doing pretty well in terms of delays"
70. Questions
How to reduce interactive latency in big data visualization?
imMens: a system supporting real-time interaction
binned aggregation for perceptual scalability
multivariate data tiles & GPU processing for low latency
How does interactive latency affect user behavior?
Comparative study: quantitative & qualitative analysis
71. Questions
How to reduce interactive latency in big data visualization?
imMens: a system supporting real-time interaction
binned aggregation for perceptual scalability
multivariate data tiles & GPU processing for low latency
How does interactive latency affect user behavior?
72. Questions
How to reduce interactive latency in big data visualization?
imMens: a system supporting real-time interaction
binned aggregation for perceptual scalability
multivariate data tiles & GPU processing for low latency
How does interactive latency affect user behavior?
User study: quantitative & qualitative analysis