This document summarizes Krist Wongsuphasawat's presentation on visualizing event sequences at the 2013 Data Visualization Summit in San Francisco. Wongsuphasawat covered techniques such as using glyphs on a timeline to represent events, interval width to represent duration, color and shape to distinguish event types, faceting for high-density sequences, and aggregation techniques like binning and kernel density estimation. He demonstrated the LifeFlow tool for providing overviews and summaries of event sequence data, and also discussed alignment of sequences, outcome-based aggregation with the Outflow tool, and applications to big event sequence data such as customer checkout processes at eBay.
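The binning idea mentioned above can be sketched in a few lines of Python. This is a toy illustration, not code from the talk; the event stream and bin width are invented for the example:

```python
from collections import Counter

def bin_events(events, bin_width):
    """Aggregate a sequence of (timestamp, event_type) pairs into
    fixed-width time bins -- the binning technique used to summarize
    high-density event sequences before rendering them."""
    bins = Counter()
    for t, kind in events:
        bins[(int(t // bin_width), kind)] += 1
    return dict(bins)

# Hypothetical event stream: timestamps in minutes, two event types.
events = [(1, "login"), (3, "click"), (12, "click"), (14, "logout")]
print(bin_events(events, bin_width=10))
# {(0, 'login'): 1, (0, 'click'): 1, (1, 'click'): 1, (1, 'logout'): 1}
```

Each bin's count can then drive the size or intensity of a glyph on the timeline instead of drawing every individual event.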
Explainable AI (XAI) is becoming a must-have non-functional requirement (NFR) for most AI-enabled product and solution deployments. Keen to hear viewpoints and explore collaboration opportunities.
Artificial Intelligence is increasingly playing an integral role in determining our day-to-day experiences. Moreover, with the proliferation of AI-based solutions in areas such as hiring, lending, criminal justice, healthcare, and education, the resulting personal and professional implications of AI are far-reaching. The dominant role played by AI models in these domains has led to a growing concern regarding potential bias in these models, and a demand for model transparency and interpretability. In addition, model explainability is a prerequisite for building trust and adoption of AI systems in high-stakes domains requiring reliability and safety, such as healthcare and automated transportation, and in critical industrial applications with significant economic implications, such as predictive maintenance, exploration of natural resources, and climate change modeling.
As a consequence, AI researchers and practitioners have focused their attention on explainable AI to help them better trust and understand models at scale. The challenges for the research community include (i) defining model explainability, (ii) formulating explainability tasks for understanding model behavior and developing solutions for these tasks, and finally (iii) designing measures for evaluating the performance of models in explainability tasks.
In this tutorial, we present an overview of model interpretability and explainability in AI, key regulations / laws, and techniques / tools for providing explainability as part of AI/ML systems. Then, we focus on the application of explainability techniques in industry, wherein we present practical challenges / guidelines for effectively using explainability techniques and lessons learned from deploying explainable models for several web-scale machine learning and data mining applications. We present case studies across different companies, spanning application domains such as search & recommendation systems, sales, lending, and fraud detection. Finally, based on our experiences in industry, we identify open problems and research directions for the data mining / machine learning community.
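As a concrete illustration of one widely used model-agnostic explainability technique, here is a minimal permutation-importance sketch in pure Python. This is not from the tutorial itself; the toy model and data are invented for illustration:

```python
import random

def permutation_importance(predict, X, y, score, n_repeats=10, seed=0):
    """Model-agnostic explainability sketch: measure how much the score
    drops when one feature column is shuffled, breaking that feature's
    relationship with the target. Larger drops mean the model relies
    more heavily on that feature."""
    rng = random.Random(seed)
    base = score(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            col = [row[j] for row in X]
            rng.shuffle(col)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, col)]
            drops.append(base - score(y, [predict(row) for row in X_perm]))
        importances.append(sum(drops) / n_repeats)
    return importances

# Toy model that only uses feature 0, so feature 1 should score zero.
model = lambda row: 1 if row[0] > 0 else 0
accuracy = lambda y, p: sum(a == b for a, b in zip(y, p)) / len(y)
X = [[1, 5], [-1, 5], [2, -3], [-2, -3]]
y = [1, 0, 1, 0]
imp = permutation_importance(model, X, y, accuracy)
```

In practice one would use a library implementation (e.g. scikit-learn's `permutation_importance`), but the core idea fits in a dozen lines.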
Scaling up Uber's real-time data analytics – Xiang Fu
Realtime infrastructure powers critical pieces of Uber. This talk will discuss the architecture, technical challenges, learnings and how a blend of open source infrastructure (Apache Kafka/Flink/Pinot) and in-house technologies have helped Uber scale and enabled SQL to power realtime decision making for city ops, data scientists, data analysts and engineers.
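The kind of realtime aggregation a system like Pinot serves via SQL at massive scale can be illustrated, at toy scale, with a sliding-window counter. This sketch is purely conceptual (the metric name is made up) and has nothing to do with Uber's actual codebase:

```python
from collections import deque

class SlidingWindowCount:
    """Toy realtime metric: count events per key over the last
    `window` seconds, evicting expired events on read."""
    def __init__(self, window):
        self.window = window
        self.events = deque()  # (timestamp, key), in arrival order

    def add(self, ts, key):
        self.events.append((ts, key))

    def count(self, now, key):
        # Evict events that fell out of the window, then count matches.
        while self.events and self.events[0][0] <= now - self.window:
            self.events.popleft()
        return sum(1 for _, k in self.events if k == key)

w = SlidingWindowCount(window=60)
w.add(0, "trips_sf")
w.add(30, "trips_sf")
w.add(90, "trips_sf")
print(w.count(100, "trips_sf"))  # prints 1: only the event at t=90 survives
```

Production systems replace this in-memory deque with partitioned, replicated storage and expose the query through SQL, but the window/eviction logic is the same idea.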
Productionalizing Models through CI/CD Design with MLflow – Databricks
Model deployment and integration often consist of several moving parts woven together through intricate steps. Automating this pipeline and feedback loop can be incredibly challenging, especially given varying model development techniques.
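One small piece of such a CI/CD pipeline can be sketched as an automated promotion gate. This is a hypothetical example, not MLflow's API; the metric names and thresholds are made up:

```python
def should_promote(candidate_metrics, production_metrics, min_improvement=0.0):
    """CI gate sketch: promote the candidate model only if it matches or
    beats the current production model on every tracked metric
    (metrics are assumed to be higher-is-better)."""
    return all(
        candidate_metrics[name] >= production_metrics[name] + min_improvement
        for name in production_metrics
    )

prod = {"auc": 0.81, "recall": 0.70}
cand = {"auc": 0.84, "recall": 0.72}
print(should_promote(cand, prod))  # True: candidate wins on both metrics
```

In a real pipeline this check would run in CI after training, reading both metric sets from a tracking server, and a passing gate would trigger a model-registry stage transition.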
Generative AI: Past, Present, and Future – A Practitioner's Perspective – Huahai Yang
As the academic realm grapples with the profound implications of generative AI and related applications like ChatGPT, I will present a grounded view from my experience as a practitioner. Starting with the origins of neural networks in the fields of logic, psychology, and computer science, I trace their history and align it within the wider context of the pursuit of artificial intelligence. This perspective will also draw parallels with historical developments in psychology. Against this backdrop, I chart a proposed trajectory for the future. Finally, I provide actionable insights for both academics and enterprising individuals in the field.
[Video recording available at https://www.youtube.com/playlist?list=PLewjn-vrZ7d3x0M4Uu_57oaJPRXkiS221]
MLOps (a compound of “machine learning” and “operations”) is a practice for collaboration and communication between data scientists and operations professionals to help manage the production machine learning lifecycle. Similar to the DevOps term in the software development world, MLOps looks to increase automation and improve the quality of production ML while also focusing on business and regulatory requirements. MLOps applies to the entire ML lifecycle - from integrating with model generation (software development lifecycle, continuous integration/continuous delivery), orchestration, and deployment, to health, diagnostics, governance, and business metrics.
To watch the full presentation click here: https://info.cnvrg.io/mlopsformachinelearning
In this webinar, we’ll discuss core practices in MLOps that will help data science teams scale to the enterprise level. You’ll learn the primary functions of MLOps, and which tasks are suggested to accelerate your team’s machine learning pipeline. Join us in a discussion with cnvrg.io Solutions Architect, Aaron Schneider, and learn how teams use MLOps for more productive machine learning workflows.
- Reduce friction between science and engineering
- Deploy your models to production faster
- Health, diagnostics and governance of ML models
- Kubernetes as a core platform for MLOps
- Support advanced use-cases like continual learning with MLOps
ROCm and Distributed Deep Learning on Spark and TensorFlow – Databricks
ROCm, the Radeon Open Ecosystem, is an open-source software foundation for GPU computing on Linux. ROCm supports TensorFlow and PyTorch using MIOpen, a library of highly optimized GPU routines for deep learning. In this talk, we describe how Apache Spark is a key enabling platform for distributed deep learning on ROCm, as it enables different deep learning frameworks to be embedded in Spark workflows in a secure end-to-end machine learning pipeline. We will analyse the different frameworks for integrating Spark with TensorFlow on ROCm, from Horovod to HopsML to Databricks' Project Hydrogen. We will also examine the surprising places where bottlenecks can surface when training models (everything from object stores to the Data Scientists themselves), and we will investigate ways to get around these bottlenecks. The talk will include a live demonstration of training and inference for a TensorFlow application embedded in a Spark pipeline written in a Jupyter notebook on Hopsworks with ROCm.
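The data-parallel training idea underlying Horovod-style frameworks can be illustrated with a toy sketch of the allreduce result: every worker ends up holding the element-wise mean of all workers' gradients. This is pure Python, not the actual Horovod or ROCm API:

```python
def allreduce_mean(worker_grads):
    """Sketch of what a ring-allreduce computes in data-parallel
    training: the element-wise mean of all workers' gradient vectors,
    which keeps every model replica in sync after each step."""
    n = len(worker_grads)
    summed = [sum(vals) for vals in zip(*worker_grads)]
    return [s / n for s in summed]

grads = [[1.0, 2.0], [3.0, 4.0]]   # gradients from two workers
print(allreduce_mean(grads))        # [2.0, 3.0]
```

Real allreduce implementations compute the same result without any central coordinator, passing partial sums around a ring so that bandwidth per worker stays constant as the cluster grows.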
DataOps: Nine steps to transform your data science impact, Strata London May 18 – Harvinder Atwal
According to Forrester Research, only 22% of companies are currently seeing a significant return from data science expenditures. Most data science implementations are high-cost IT projects, local applications that are not built to scale for production workflows, or laptop decision support projects that never impact customers. Despite this high failure rate, we keep hearing the same mantra and solutions over and over again. Everybody talks about how to create models, but not many people talk about getting them into production where they can impact customers.
Harvinder Atwal offers an entertaining and practical introduction to DataOps, a new and independent approach to delivering data science value at scale, used at companies like Facebook, Uber, LinkedIn, Twitter, and eBay. The key to adding value through DataOps is to adapt and borrow principles from Agile, Lean, and DevOps. However, DataOps is not just about shipping working machine learning models; it starts with better alignment of data science with the rest of the organization and its goals. Harvinder shares experience-based solutions for increasing your velocity of value creation, including Agile prioritization and collaboration, new operational processes for an end-to-end data lifecycle, developer principles for data scientists, cloud solution architectures to reduce data friction, self-service tools giving data scientists freedom from bottlenecks, and more. The DataOps methodology will enable you to eliminate daily barriers, putting your data scientists in control of delivering ever-faster cutting-edge innovation for your organization and customers.
In this presentation, two different data sets are used to apply the machine learning classification techniques introduced in the Introduction to Data Mining and Machine Learning coursework. Both data sets were chosen based on their outputs and the team members' interests: the Electrical Grid Stability simulated data set, and the Olivetti data set for face recognition.
Building and deploying LLM applications with Apache Airflow – Kaxil Naik
Behind the growing interest in generative AI and LLM-based enterprise applications lies an expanded set of requirements for data integrations and ML orchestration. Enterprises want to use proprietary data to power LLM-based applications that create new business value, but they face challenges in moving beyond experimentation. The pipelines that power these models need to run reliably at scale, bringing together data from many sources and reacting continuously to changing conditions.
This talk focuses on the design patterns for using Apache Airflow to support LLM applications created using private enterprise data. We’ll go through a real-world example of what this looks like, as well as a proposal to improve Airflow and to add additional Airflow Providers to make it easier to interact with LLMs such as those from OpenAI (e.g., GPT-4) and those on HuggingFace, while working with both structured and unstructured data.
In short, this shows how these Airflow patterns enable reliable, traceable, and scalable LLM applications within the enterprise.
https://airflowsummit.org/sessions/2023/keynote-llm/
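The extract-chunk-embed-load pattern that such a DAG typically orchestrates can be sketched in plain Python. All function names below are hypothetical stand-ins, not Airflow operators or OpenAI APIs, and the stand-in "embedding" is just a placeholder where a real pipeline would call a model:

```python
def extract(docs):
    """Pull raw enterprise documents (hypothetical stand-in source)."""
    return [d.strip() for d in docs if d.strip()]

def chunk(texts, size=20):
    """Split each document into fixed-size character chunks for embedding."""
    return [t[i:i + size] for t in texts for i in range(0, len(t), size)]

def embed(chunks):
    """Stand-in embedding: a real pipeline would call a model API here."""
    return [(c, float(len(c))) for c in chunks]

def load(vectors, store):
    """Write (chunk, vector) pairs into the serving store."""
    store.extend(vectors)
    return store

store = []
pieces = chunk(extract(["  hello world  ", ""]))
load(embed(pieces), store)
print(len(store))  # prints 1: one 11-character chunk became one vector
```

In Airflow, each of these functions would become a task (e.g. via the TaskFlow API), giving the pipeline scheduling, retries, and per-task observability.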
Processes are the building blocks of every organization. Yet, many organizations do not have consistent and repeatable processes. Research shows that projects managed using structured processes leveraging “best practices” consistently show higher performance than those that do not. This session focuses on a method from ISO to improve processes and eliminate defects. Assessing process capability demonstrably helps lower risk associated with the processes.
Main points covered:
• What is a Process Reference Model?
• What is process capability and how do I measure it?
• How to use a Process Assessment Model to assess processes?
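As a rough illustration of measuring process capability, the sketch below maps a process attribute's achievement percentage onto the N/P/L/F rating scale common in ISO-style process assessments. The boundary values follow the widely used 15/50/85 convention and are an assumption here, not taken from the session:

```python
def rate_attribute(achievement_pct):
    """Map a process-attribute achievement percentage onto the
    N/P/L/F rating scale used in ISO-style process assessments."""
    if achievement_pct <= 15:
        return "N"   # Not achieved
    if achievement_pct <= 50:
        return "P"   # Partially achieved
    if achievement_pct <= 85:
        return "L"   # Largely achieved
    return "F"       # Fully achieved

print([rate_attribute(p) for p in (10, 40, 70, 95)])  # ['N', 'P', 'L', 'F']
```

A full assessment would rate each process attribute this way and then derive the process's capability level from the combined ratings.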
Presenter:
Peter Davis is the Principal of Peter Davis Associates, a management consulting firm specializing in Governance, Security, and Audit. Prior to founding PDA, Mr. Davis’ private sector experience included stints with two large Canadian banks and a manufacturing company. He was formerly a principal in the Information Systems Audit practice of Ernst & Young. In the public sector, Mr. Davis was Director of Information Systems Audit in the Office of the Provincial Auditor (Ontario), where he had oversight audit responsibilities for all Ontario crown corporations, agencies and boards.
Mr. Davis has written or co-written 13 books including “Project Management Process Capability Assessment,” “Lean Six Sigma Secrets for the CIO,” and “Hacking Wireless Networks for Dummies.” Peter currently teaches COBIT 5 Foundation/Implementation/Assessor/Implementing NIST Cyber-security Framework using COBIT 5, ISO 20000 FC/LI/LA, ISO 27001 LI/LA, ISO 27032 LM, ISO 27005 RM, and ISO 31000 RM.
Organizer: Ardian Berisha
Date: September 5th, 2018
Recorded webinar link: https://youtu.be/NECQ5Angadw
Slidedeck from our seminar about Machine Learning (07/11/2014)
Topics covered:
- What is Machine Learning?
- Techniques (clustering, classification, ...)
- Tools (Mahout, R, Spark MLlib, Weka, ...)
- Practical examples of Machine Learning applications
- How to embed Machine Learning in software development
- Demos
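As a tiny taste of the classification techniques listed above, here is a nearest-centroid classifier in plain Python. It is a toy example invented for this summary, not code from the slides:

```python
def nearest_centroid(train, labels, point):
    """Tiny classification example: assign `point` to the class whose
    training-set centroid (mean point) is closest in Euclidean distance."""
    groups = {}
    for x, y in zip(train, labels):
        groups.setdefault(y, []).append(x)

    def centroid(pts):
        return [sum(coord) / len(pts) for coord in zip(*pts)]

    def dist2(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    return min(groups, key=lambda y: dist2(centroid(groups[y]), point))

X = [[0, 0], [1, 0], [10, 10], [11, 10]]
y = ["a", "a", "b", "b"]
print(nearest_centroid(X, y, [0.5, 0.2]))  # prints a
```

Libraries like Weka, Spark MLlib, or scikit-learn provide far richer classifiers, but this captures the core idea of learning class summaries from labeled data and predicting by similarity.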
MLflow is an MLOps tool that enables data scientists to quickly productionize their machine learning projects. To achieve this, MLflow has four major components: Tracking, Projects, Models, and Registry. MLflow lets you train, reuse, and deploy models with any library and package them into reproducible steps. It is designed to work with any machine learning library and requires minimal changes to integrate into an existing codebase. In this session, we will cover common pain points of machine learning developers, such as experiment tracking, reproducibility, deployment tooling, and model versioning. Get ready to get your hands dirty with a quick ML project using MLflow, releasing it to production to understand the MLOps lifecycle.
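The experiment-tracking idea behind MLflow's Tracking component can be sketched in a few lines. This is a conceptual stand-in, deliberately not MLflow's actual API:

```python
import time

class RunTracker:
    """Minimal stand-in for the experiment-tracking concept: record the
    parameters and metrics of every training run so results are
    comparable and reproducible later."""
    def __init__(self):
        self.runs = []

    def log_run(self, params, metrics):
        run = {"id": len(self.runs), "time": time.time(),
               "params": params, "metrics": metrics}
        self.runs.append(run)
        return run["id"]

    def best_run(self, metric):
        # Return the run with the highest value for `metric`.
        return max(self.runs, key=lambda r: r["metrics"][metric])

tracker = RunTracker()
tracker.log_run({"lr": 0.1}, {"acc": 0.81})
tracker.log_run({"lr": 0.01}, {"acc": 0.87})
print(tracker.best_run("acc")["params"])  # prints {'lr': 0.01}
```

MLflow's real Tracking API adds artifact storage, a UI, and a shared server, but the core value is the same: every run's configuration and outcome is recorded and queryable.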
Drifting Away: Testing ML Models in Production – Databricks
Deploying machine learning models has become a relatively frictionless process. However, properly deploying a model with a robust testing and monitoring framework is a vastly more complex task. There is no one-size-fits-all solution when it comes to productionizing ML models, oftentimes requiring custom implementations utilising multiple libraries and tools. There is, however, a set of core statistical tests and metrics one should have in place to detect phenomena such as data and concept drift, to prevent models from becoming unknowingly stale and detrimental to the business.
Combining our experiences from working with Databricks customers, we do a deep dive on how to test your ML models in production using open source tools such as MLflow, SciPy and statsmodels. You will come away from this talk armed with knowledge of the key tenets for testing both model and data validity in production, along with a generalizable demo which uses MLflow to assist with the reproducibility of this process.
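The drift signal behind such tests can be illustrated with a pure-Python two-sample Kolmogorov-Smirnov statistic. In practice you would use `scipy.stats.ks_2samp`, which also returns a p-value, as the talk suggests; the samples below are invented:

```python
def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: the maximum gap between
    the two empirical CDFs. A large value suggests the live feature
    distribution has drifted away from the training distribution."""
    a, b = sorted(sample_a), sorted(sample_b)

    def ecdf(sample, x):
        return sum(1 for v in sample if v <= x) / len(sample)

    # The maximum gap always occurs at one of the observed points.
    return max(abs(ecdf(a, x) - ecdf(b, x)) for x in a + b)

train = [1, 2, 3, 4, 5]
live_same = [1, 2, 3, 4, 5]
live_shifted = [11, 12, 13, 14, 15]
print(ks_statistic(train, live_same))     # 0.0  (no drift)
print(ks_statistic(train, live_shifted))  # 1.0  (complete separation)
```

A production monitor would compute this per feature on a schedule and alert (or trigger retraining) when the statistic's p-value falls below a chosen threshold.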
This session is a continuation of “Automated Production Ready ML at Scale” from the last Spark + AI Summit Europe. In this session you will learn how H&M has evolved its reference architecture covering the entire MLOps stack, addressing common challenges in AI and machine learning products such as development efficiency, end-to-end traceability, and speed to production.
Bighead: Airbnb’s End-to-End Machine Learning Platform with Krishna Puttaswa... – Databricks
Airbnb has a wide variety of ML problems ranging from models on traditional structured data to models built on unstructured data such as user reviews, messages and listing images. The ability to build, iterate on, and maintain healthy machine learning models is critical to Airbnb’s success. Many ML Platforms cover data collection, feature engineering, training, deploying, productionalization, and monitoring but few, if any, do all of the above seamlessly.
Bighead aims to tie together various open source and in-house projects to remove incidental complexity from ML workflows. Bighead is built on Python and Spark and can be used in modular pieces as each ML problem presents unique challenges. Through standardization of the path to production, training environments and the methods for collecting and transforming data on Spark, each model is reproducible and iterable.
This talk covers the architecture, the problems that each individual component and the overall system aim to solve, and a vision for the future of machine learning infrastructure. Bighead is widely adopted at Airbnb, and we have a variety of models running in production. We have seen overall model development time go down from many months to days on Bighead. We plan to open source Bighead to allow the wider community to benefit from our work.
MLOps (a compound of “machine learning” and “operations”) is a practice for collaboration and communication between data scientists and operations professionals to help manage the production machine learning lifecycle. Similar to the DevOps term in the software development world, MLOps looks to increase automation and improve the quality of production ML while also focusing on business and regulatory requirements. MLOps applies to the entire ML lifecycle - from integrating with model generation (software development lifecycle, continuous integration/continuous delivery), orchestration, and deployment, to health, diagnostics, governance, and business metrics.
To watch the full presentation click here: https://info.cnvrg.io/mlopsformachinelearning
In this webinar, we’ll discuss core practices in MLOps that will help data science teams scale to the enterprise level. You’ll learn the primary functions of MLOps, and what tasks are suggested to accelerate your teams machine learning pipeline. Join us in a discussion with cnvrg.io Solutions Architect, Aaron Schneider, and learn how teams use MLOps for more productive machine learning workflows.
- Reduce friction between science and engineering
- Deploy your models to production faster
- Health, diagnostics and governance of ML models
- Kubernetes as a core platform for MLOps
- Support advanced use-cases like continual learning with MLOps
ROCm and Distributed Deep Learning on Spark and TensorFlowDatabricks
ROCm, the Radeon Open Ecosystem, is an open-source software foundation for GPU computing on Linux. ROCm supports TensorFlow and PyTorch using MIOpen, a library of highly optimized GPU routines for deep learning. In this talk, we describe how Apache Spark is a key enabling platform for distributed deep learning on ROCm, as it enables different deep learning frameworks to be embedded in Spark workflows in a secure end-to-end machine learning pipeline. We will analyse the different frameworks for integrating Spark with Tensorflow on ROCm, from Horovod to HopsML to Databrick's Project Hydrogen. We will also examine the surprising places where bottlenecks can surface when training models (everything from object stores to the Data Scientists themselves), and we will investigate ways to get around these bottlenecks. The talk will include a live demonstration of training and inference for a Tensorflow application embedded in a Spark pipeline written in a Jupyter notebook on Hopsworks with ROCm.
DataOps: Nine steps to transform your data science impact Strata London May 18Harvinder Atwal
According to Forrester Research, only 22% of companies are currently seeing a significant return from data science expenditures. Most data science implementations are high-cost IT projects, local applications that are not built to scale for production workflows, or laptop decision support projects that never impact customers. Despite this high failure rate, we keep hearing the same mantra and solutions over and over again. Everybody talks about how to create models, but not many people talk about getting them into production where they can impact customers.
Harvinder Atwal offers an entertaining and practical introduction to DataOps, a new and independent approach to delivering data science value at scale, used at companies like Facebook, Uber, LinkedIn, Twitter, and eBay. The key to adding value through DataOps is to adapt and borrow principles from Agile, Lean, and DevOps. However, DataOps is not just about shipping working machine learning models; it starts with better alignment of data science with the rest of the organization and its goals. Harvinder shares experience-based solutions for increasing your velocity of value creation, including Agile prioritization and collaboration, new operational processes for an end-to-end data lifecycle, developer principles for data scientists, cloud solution architectures to reduce data friction, self-service tools giving data scientists freedom from bottlenecks, and more. The DataOps methodology will enable you to eliminate daily barriers, putting your data scientists in control of delivering ever-faster cutting-edge innovation for your organization and customers.
In this presentation, two different data-sets are being collected to implement the machine learning classification techniques introduced from introduction to data mining and machine learning coursework. Both data-sets are collected by analyzing their output and team members interest. Following are the data-sets named as, Electricity grid stability simulated data-set and Face Recognition on Olivetti Data set
Building and deploying LLM applications with Apache AirflowKaxil Naik
Behind the growing interest in Generate AI and LLM-based enterprise applications lies an expanded set of requirements for data integrations and ML orchestration. Enterprises want to use proprietary data to power LLM-based applications that create new business value, but they face challenges in moving beyond experimentation. The pipelines that power these models need to run reliably at scale, bringing together data from many sources and reacting continuously to changing conditions.
This talk focuses on the design patterns for using Apache Airflow to support LLM applications created using private enterprise data. We’ll go through a real-world example of what this looks like, as well as a proposal to improve Airflow and to add additional Airflow Providers to make it easier to interact with LLMs such as the ones from OpenAI (such as GPT4) and the ones on HuggingFace, while working with both structured and unstructured data.
In short, this shows how these Airflow patterns enable reliable, traceable, and scalable LLM applications within the enterprise.
https://airflowsummit.org/sessions/2023/keynote-llm/
Processes are the building blocks of every organization. Yet, many organizations do not have consistent and repeatable processes. Research shows that projects managed using structured processes leveraging “best practices” consistently show higher performance than those that do not. This session focuses on a method from ISO to improve processes and eliminate defects. Assessing process capability demonstrably helps lower risk associated with the processes.
Main points covered:
• What is a Process Reference Model?
• What is process capability and how do I measure it?
• How to use a Process Assessment Model to assess processes?
Presenter:
Peter Davis is the Principal of Peter Davis Associates, a management consulting firm specializing in Governance, Security, and Audit. Prior to founding PDA, Mr. Davis’ private sector experience included stints with two large Canadian banks and a manufacturing company. He was formerly a principal in the Information Systems Audit practice of Ernst & Young. In the public sector, Mr. Davis was Director of Information Systems Audit in the Office of the Provincial Auditor (Ontario), where he had oversight audit responsibilities for all Ontario crown corporations, agencies and boards.
Mr. Davis has written or co-written 13 books including “Project Management Process Capability Assessment,” “Lean Six Sigma Secrets for the CIO,” and “Hacking Wireless Networks for Dummies.” Peter currently teaches COBIT 5 Foundation/Implementation/Assessor/Implementing NIST Cyber-security Framework using COBIT 5, ISO 20000 FC/LI/LA ISO 27001 LI/LA, ISO 27032 LM, ISO 27005 RM, and ISO 31000 RM.
Organizer: Ardian Berisha
Date: September 5th, 2018
Recorded webinar link: https://youtu.be/NECQ5Angadw
Slidedeck from our seminar about Machine Learning (07/11/2014)
Topics covered:
- What is Machine Learning?
- Techiques (clustering, classification, ...)
- Tools (Mahout, R, Spark MlLib, Weka, ...)
- Practical example of Machine Learning applications
- How to embed Machine Learning in software development
- Demos
MLflow is an MLOps tool that enables data scientists to quickly productionize their machine learning projects. To achieve this, MLflow has four major components: Tracking, Projects, Models, and Registry. MLflow lets you train, reuse, and deploy models with any library and package them into reproducible steps. MLflow is designed to work with any machine learning library and requires minimal changes to integrate into an existing codebase. In this session, we will cover the common pain points of machine learning developers, such as tracking experiments, reproducibility, deployment tooling, and model versioning. Get ready to get your hands dirty by building a quick ML project with MLflow and releasing it to production to understand the MLOps lifecycle.
Drifting Away: Testing ML Models in Production (Databricks)
Deploying machine learning models has become a relatively frictionless process. However, properly deploying a model with a robust testing and monitoring framework is a vastly more complex task. There is no one-size-fits-all solution when it comes to productionizing ML models, oftentimes requiring custom implementations utilising multiple libraries and tools. There are however, a set of core statistical tests and metrics one should have in place to detect phenomena such as data and concept drift to prevent models from becoming unknowingly stale and detrimental to the business.
Combining our experiences from working with Databricks customers, we do a deep dive on how to test your ML models in production using open source tools such as MLflow, SciPy and statsmodels. You will come away from this talk armed with knowledge of the key tenets for testing both model and data validity in production, along with a generalizable demo which uses MLflow to assist with the reproducibility of this process.
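The core drift test the talk describes can be done with SciPy's `ks_2samp`; as a dependency-free sketch of the same idea, here is a hand-rolled two-sample Kolmogorov–Smirnov statistic with a hypothetical drift threshold (the 0.2 cut-off and the Gaussian test data are illustrative, not from the talk):

```python
import bisect
import random

def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: the maximum gap
    between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)

    def ecdf(sorted_sample, x):
        # Fraction of the sample that is <= x.
        return bisect.bisect_right(sorted_sample, x) / len(sorted_sample)

    return max(abs(ecdf(a, x) - ecdf(b, x)) for x in a + b)

random.seed(0)
train = [random.gauss(0.0, 1.0) for _ in range(1000)]        # training data
prod_ok = [random.gauss(0.0, 1.0) for _ in range(1000)]      # same distribution
prod_drifted = [random.gauss(1.5, 1.0) for _ in range(1000)]  # mean shift

DRIFT_THRESHOLD = 0.2  # hypothetical cut-off; tune per feature
drift_ok = ks_statistic(train, prod_ok)
drift_bad = ks_statistic(train, prod_drifted)
```

In production you would run such a test per feature on a schedule and alert (or retrain) when the statistic crosses the threshold.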
This session is a continuation of “Automated Production Ready ML at Scale” from the last Spark + AI Summit Europe. In this session you will learn how H&M evolves its reference architecture, covering the entire MLOps stack and addressing common challenges in AI and machine learning products, such as development efficiency, end-to-end traceability, and speed to production.
Bighead: Airbnb’s End-to-End Machine Learning Platform with Krishna Puttaswa... (Databricks)
Airbnb has a wide variety of ML problems ranging from models on traditional structured data to models built on unstructured data such as user reviews, messages and listing images. The ability to build, iterate on, and maintain healthy machine learning models is critical to Airbnb’s success. Many ML Platforms cover data collection, feature engineering, training, deploying, productionalization, and monitoring but few, if any, do all of the above seamlessly.
Bighead aims to tie together various open source and in-house projects to remove incidental complexity from ML workflows. Bighead is built on Python and Spark and can be used in modular pieces as each ML problem presents unique challenges. Through standardization of the path to production, training environments and the methods for collecting and transforming data on Spark, each model is reproducible and iterable.
This talk covers the architecture, the problems that each individual component and the overall system aim to solve, and a vision for the future of machine learning infrastructure. Bighead is widely adopted at Airbnb, with a variety of models running in production. We have seen overall model development time go down from many months to days on Bighead. We plan to open source Bighead to allow the wider community to benefit from our work.
This talk was prepared as a note to my future self when working on future projects. I reflect on the tasks commonly involved in crafting visualizations, point out the common things to expect, pitfalls and provide recommendations. Along the way I include examples of 3 different applications of information/data visualization and details on how each project was started and developed.
These slides were from my guest lectures in:
(1) the InfoVis class at UC Berkeley iSchool on Feb 27, 2017. Thank you Prof. Marti Hearst for the invitation.
(2) the DataVis class at GATech on Apr 5, 2017. Thank you Prof. Rahul C. Basole for the invitation.
Paper presentation at the ACM SIGCHI Conference on Human Factors in Computing Systems (CHI). Vancouver, BC (May 10, 2011)
More info:
http://www.cs.umd.edu/hcil/lifeflow
Abstract:
Event sequence analysis is an important task in many domains: medical researchers may study the patterns of transfers within the hospital for quality control; transportation experts may study accident response logs to identify best practices. In many cases they deal with thousands of records. While previous research has focused on searching and browsing, overview tasks are often overlooked. We introduce a novel interactive visual overview of event sequences called LifeFlow. LifeFlow is scalable, can summarize all possible sequences, and represents the temporal spacing of the events within sequences. Two case studies with healthcare and transportation domain experts are presented to illustrate the usefulness of LifeFlow. A user study with ten participants confirmed that after 15 minutes of training, novice users were able to rapidly answer questions about the prevalence and temporal characteristics of sequences, find anomalies, and gain significant insight from the data.
Real-time event processing monitors the incoming data stream and initiates action based on detected events like fraud, error or performance degradation. These events are often used to issue alerts and notifications, take responsive action, or to populate a monitoring dashboard. In this session, we will walk through different use cases for event processing and demonstrate how to build a scalable pipeline for tracking IoT device status. AWS services to be covered include: AWS Lambda and the Kinesis Client Library (KCL).
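The session demonstrates this pattern with AWS Lambda and the Kinesis Client Library; as a service-agnostic sketch of the core idea (hypothetical record format and thresholds), here is a monitor that scans an incoming stream and emits alerts on detected error spikes:

```python
from collections import deque

def detect_events(stream, error_threshold=3, window=5):
    """Scan a stream of status records and yield an alert whenever the
    number of 'error' statuses within the sliding window reaches the
    threshold."""
    recent = deque(maxlen=window)
    for record in stream:
        recent.append(record)
        errors = [r for r in recent if r["status"] == "error"]
        if len(errors) >= error_threshold:
            yield {"alert": "error_spike",
                   "device": record["device"],
                   "count": len(errors)}
            recent.clear()  # avoid re-alerting on the same burst

readings = [{"device": "sensor-1", "status": s}
            for s in ["ok", "error", "ok", "error", "error", "ok", "ok"]]
alerts = list(detect_events(readings))
```

In a real pipeline the `stream` would be records consumed from Kinesis and the `yield` would publish a notification or update a dashboard.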
I present the design and implementation of an ontology for scholarly event description (SEDE) to provide a backbone to represent, collect, share, and allow inference from scholarly event information.
Business Event Processing Beyond the Horizon (Opher Etzion)
This is a presentation given in IBM Websphere IMPACT 2009, May 2009, Las Vegas together with Kyle Brown. It contains some thoughts that are demonstrated through customers' scenarios on future functionality in event processing products.
DataEngConf SF16 - Multi-temporal Data Structures (Hakka Labs)
A mind-bending way of dealing with time syncing when aggregating data from many disparate sources. Talk by Jasmine Tsai and Alyssa Kwan, Clover Health. To hear about future conferences go to http://dataengconf.com
Mapping and analysis use in humanitarian aid and development that can provide better insight and action.
Open Data, open platforms, and open collaboration.
These slides review the research of our lab since 2016 on applied deep learning, starting from our participation in the TRECVID Instance Search 2014, moving into video analysis with CNN+RNN architectures, and our current efforts in sign language translation and production.
As part of their daily work, developers interact with Integrated Development Environments (IDE), generating thousands of events. Together with other aspects of development, this data also captures the modus operandi of the developer, including all the program entities she interacted with during a development session. This "working set" (or context) is leveraged by developers to create and maintain their mental model of the software system at hand. Understanding how developers navigate and interact with source code during a development session is an open question.
We present a novel visual approach to understand how working sets evolve during a development session. The visualization incrementally depicts all the program entities involved in a development session, the intensity of the developer activity on them, and the navigation paths that occurred between them. We visualized about a thousand development sessions, and categorized them according to their visual properties.
Mapping, open data, and open platforms in support of crisis response, humanitarian aid and development. How organizations can better understand and share their impacts.
This was prepared for a presentation of http://swiftapp.org at InSTEDD. It gives an overview of the latest designs (which are changing quickly) and a backgrounder on how we got to this point.
Brief introduction to CEP and terminology:
o Drools Vision
o Drools Fusion: Complex Event Processing extensions
o Event Declaration and Semantics
o Event Cloud, Streams and the Session Clock
o Temporal Reasoning
o Sliding Window Support
o Streams Support
o Memory Management
“Which visualization library should I use?” Typically, making this decision is not about whether one library is “better” than another, but whether the specific library is more suitable for what the developer is trying to achieve. To answer this question thoroughly, we need to better understand the design space of visualization libraries. The talk will give a tour of many kinds of visualization libraries on the web across the design space, while explaining the framework and design philosophy that the audience can pick up along the way. The audience will expand their horizons and become more aware of the wide universe of libraries. The next time they come across a new package, they can use this framework as a lens to analyze its offerings and how it differs from or resembles the libraries they already know.
Encodable: Configurable Grammar for Visualization Components (Krist Wongsuphasawat)
There are so many libraries of visualization components nowadays, with APIs that often differ from one another. Could these components be more similar, both in terms of their APIs and common functionalities? For someone developing a new visualization component, what should the API look like? This work drew inspiration from visualization grammar, decoupled the grammar from its rendering engine, and adapted it into a configurable grammar for individual components called Encodable. Encodable helps component authors define a grammar for their components and parse encoding specifications from users into utility functions for the implementation.
This talk was prepared as a note to my future self when working on future projects. I reflect on the tasks commonly involved in crafting visualizations, point out the common things to expect, pitfalls and provide recommendations. Along the way I include examples of different applications of information/data visualization and details on how each project was started and developed.
These slides were from my (remote) guest lecture in InfoVis class for UC Berkeley iSchool on Apr 8, 2020 during the COVID-19 shelter-in-place. Thank you Prof. Marti Hearst for the invitation.
Slides from the VIS in practice panel "Increasing the Impact of Visualization Research" during IEEE VIS 2017 in Phoenix, AZ. http://www.visinpractice.rwth-aachen.de/panel.html
Reveal the talking points of every episode of Game of Thrones from fans' conv... (Krist Wongsuphasawat)
You may not be sure how Lord Varys collects information from his little birds, but in this talk you will hear how we can collect information from our little birds.
@kristw shares a behind-the-scenes view of his latest data visualization project, which shows how each #GameOfThrones episode was discussed on Twitter. Using data visualization, we can extract and reveal the stories of every episode from fans’ Tweets.
https://interactive.twitter.com/game-of-thrones
These slides are from a talk given at Bay Area d3 User Group meetup on June 9, 2016.
http://www.meetup.com/Bay-Area-d3-User-Group/events/231281298
In this talk, I reflect on the tasks commonly involved in crafting visualizations and show examples of different applications of information/data visualization. Along this ride I will share my workflow, point out the common pitfalls and provide recommendations.
These slides were from my guest lecture in the InfoVis class at UC Berkeley iSchool on Apr 11, 2016. Thank you Prof. Marti Hearst for the invitation.
Adventure in Data: A tour of visualization projects at Twitter (Krist Wongsuphasawat)
Guest lecture at Prof. David Gotz's UNC Chapel Hill INLS 690 Visual Analytics class (Given remotely) on Nov 10, 2015.
Many demos can also be accessed from interactive.twitter.com and kristw.yellowpigz.com
d3Kit is a set of tools to speed up D3-related project development. It is a lightweight library that helps you with the basic groundwork tasks you need when building visualizations with D3.
Using Visualizations to Monitor Changes and Harvest Insights from a Global-sc... (Krist Wongsuphasawat)
Slides from my talk at the IEEE Conference on Visual Analytics Science and Technology (VAST) 2014 in Paris, France.
ABSTRACT
Logging user activities is essential to data analysis for internet products and services.
Twitter has built a unified logging infrastructure that captures user activities across all clients it owns, making it one of the largest datasets in the organization.
This paper describes challenges and opportunities in applying information visualization to log analysis at this massive scale, and shows how various visualization techniques can be adapted to help data scientists extract insights.
In particular, we focus on two scenarios: (1) monitoring and exploring a large collection of log events, and (2) performing visual funnel analysis on log data with tens of thousands of event types.
Two interactive visualizations were developed for these purposes:
we discuss design choices and the implementation of these systems, along with case studies of how they are being used in day-to-day operations at Twitter.
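Funnel analysis over log events, as described in the abstract, amounts to counting how far each session progresses through an ordered list of steps; a minimal sketch with hypothetical step names (the talk's actual systems are far more elaborate):

```python
def funnel_counts(sessions, steps):
    """For each funnel step, count the sessions that reach it with all
    earlier steps appearing before it, in order."""
    counts = [0] * len(steps)
    for session in sessions:
        pos = 0  # search position within the session
        for i, step in enumerate(steps):
            try:
                pos = session.index(step, pos) + 1
            except ValueError:
                break  # session drops out of the funnel here
            counts[i] += 1
    return counts

sessions = [
    ["home", "search", "product", "checkout"],
    ["home", "search", "exit"],
    ["home", "product"],
]
counts = funnel_counts(sessions, ["home", "search", "product", "checkout"])
```

The resulting per-step counts are exactly what a funnel visualization renders as successively narrower bars.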
Making Sense of Millions of Thoughts: Finding Patterns in the Tweets (Krist Wongsuphasawat)
I gave this presentation at Workshop on Interactive Language Learning, Visualization, and Interfaces / ACL 2014 in Baltimore, MD on June 27, 2014.
http://nlp.stanford.edu/events/illvi2014/index.html
ABSTRACT
Everyday on Twitter, there are millions of thoughts that are captured and shared to the world in the form of 140-character messages, or Tweets. There are many things we could learn from these thoughts if we could figure out a way to digest this gigantic dataset. Visualization is one of the many ways to extract information from these Tweets. In this presentation, I will talk about several visualizations based on Tweets, as well as share experiences and challenges from working with Tweet data.
A talk at Data Visualization Summit 2014 in Santa Clara, CA
ABSTRACT: What is the thought process that transforms data into visualizations? In this presentation, I will talk about guidelines that will help you when starting with raw data, walk through standard techniques, and also discuss things to keep in mind when making design decisions.
1. Data Visualization Summit, San Francisco, CA, Apr 11, 2013
Visualizations for Event Sequences Exploration
Krist Wongsuphasawat, Data Visualization Scientist, Twitter, Inc. (@kristw)
22. Event sequence: glyphs on a timeline + interval width + event types (colors, shapes) + high density
23. high density: too many overlaps and occlusions along the time axis
24. high density >> facet. Google Chrome Developer Tools facets its Timeline events into loading, scripting, and rendering & painting (Google Chrome > Developer Tools > Timeline).
25. high density >> facet
Lifelines
http://www.cs.umd.edu/lifelines
26. high density >> binning
British History Timeline
bin by year
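The binning idea on this slide is simple to express; a sketch with made-up event dates (nothing here comes from the British History Timeline itself):

```python
from collections import Counter
from datetime import date

events = [date(2011, 3, 5), date(2011, 8, 19), date(2012, 1, 2),
          date(2012, 6, 30), date(2012, 11, 11), date(2013, 4, 1)]

# Bin events by year so a dense timeline becomes a handful of counts,
# one bar per year instead of one glyph per event.
counts_by_year = Counter(d.year for d in events)
```

Changing the key function (`d.month`, `d.isocalendar()[1]`, ...) changes the bin granularity.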
27. high density >> aggregation. CloudLines: raw event data → kernel density estimation + importance function + truncation → encode as cloud size.
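The first stage of that CloudLines-style pipeline can be approximated in a few lines; this is a simplified sketch (Gaussian kernel only, omitting the importance function and truncation steps):

```python
import math

def kernel_density(event_times, query_time, bandwidth=1.0):
    """Smooth raw event timestamps into a continuous density estimate
    with a Gaussian kernel; bursts of events produce high density."""
    norm = 1.0 / (len(event_times) * bandwidth * math.sqrt(2 * math.pi))
    return norm * sum(
        math.exp(-0.5 * ((query_time - t) / bandwidth) ** 2)
        for t in event_times)

events = [1.0, 1.2, 1.5, 8.0]  # a burst near t=1 and one isolated event
burst_density = kernel_density(events, 1.2)
sparse_density = kernel_density(events, 8.0)
```

The density values would then be mapped to "cloud size" (mark area or opacity) along the timeline.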
28. high density >> aggregation
CloudLines (2)
Krstajic, M., Bertini, E., & Keim, D. A. (2011).
CloudLines: Compact Display of Event Episodes in Multiple Time-Series.
IEEE Transactions on Visualization and Computer Graphics, 17(12), 2432.
29. Event sequence: glyphs on a timeline (linear or non-linear) + interval width + event types (colors, shapes) + high density (facet, aggregation, binning)
30. circular timeline. Linear axis: 2008, 2009, 2010, 2011, 2012. Circular axis: the months Jan–Dec arranged around a circle, which reveals repeating patterns.
31. circular timeline (2)
Traffic Incidents
VanDaniker, M. (2010). Leverage of Spiral Graph for Transportation System Data Visualization.
Transportation Research Record: Journal of the Transportation Research Board, 2165, 79–88.
33. stacked timeline (2)
Tweet Volume
Rios, M., & Lin, J. (2012). Distilling Massive Amounts of Data into Simple Visualizations : Twitter Case Studies.
Proceedings of the Workshop on Social Media Visualization (SocMedVis) at ICWSM 2012 (pp. 22–25).
34. Event sequence: glyphs on a timeline (linear or non-linear) + interval width + event types (colors, shapes) + high density (facet, aggregation, binning)
53. aggregation by time
temporal summary
Wang, T. D., Plaisant, C., Shneiderman, B., Spring, N., Roseman, D., Marchand, G., Mukherjee, V., et al. (2009).
Temporal Summaries: Supporting Temporal Categorical Searching, Aggregation and Comparison.
IEEE Transactions on Visualization and Computer Graphics, 15(6), 1049–1056.
54. collection: event sequences 1, 2, ..., n. Interactions: align, rank, filter, search, group. Aggregation: by time, by sequence.
55. aggregation by sequence: LifeFlow. E.g., (1) What happened to the patients after they arrived? (align by Arrival) (2) What happened to the patients before & after the ICU? (align by ICU)
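LifeFlow's align-by-event question ("what happened after arrival?") boils down to re-indexing each sequence around a sentinel event; a minimal sketch with made-up patient records (not LifeFlow's actual implementation):

```python
def align(sequences, sentinel):
    """Split each sequence at its first occurrence of the sentinel event,
    returning (before, after) halves so sequences can be compared
    relative to the alignment point. Sequences lacking it are dropped."""
    aligned = []
    for seq in sequences:
        if sentinel in seq:
            i = seq.index(sentinel)
            aligned.append((seq[:i], seq[i + 1:]))
    return aligned

patients = [
    ["Arrival", "ER", "ICU", "Discharge"],
    ["Arrival", "ICU", "Die"],
    ["Transfer", "Floor"],  # no Arrival event; excluded from the view
]
after_arrival = [after for _, after in align(patients, "Arrival")]
```

Aggregating the `after` halves by their common prefixes is what produces the LifeFlow tree of "what happened next".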
57. Demo
LifeFlow
Wongsuphasawat, K., Guerra Gómez, J. A., Plaisant, C., Wang, T. D., Taieb-Maimon, M., & Shneiderman, B. (2011).
LifeFlow: Visualizing an Overview of Event Sequences. Proceedings of CHI'2011 (pp. 1747–1756).
81. Outflow encoding: past and future around the alignment point. A node's horizontal position shows the sequence of states (e1, e2, ...); a node's height is the number of records; color is the outcome measure; a time edge's width is the duration of the transition; link edges connect nodes; end-of-path markers show where sequences terminate.
83. Wongsuphasawat, K., & Gotz, D. (2012).
Exploring Flow, Factors, and Outcomes of Temporal Event Sequences with the Outflow Visualization.
IEEE Transactions on Visualization and Computer Graphics, 18(12), 2659–2668.
84. collection: event sequences 1, 2, ..., n. Interactions: align, rank, filter, search, group. Aggregation: by time, by sequence + outcome.
88. Event Sequence Analysis at eBay: alignment.
Shen, Z., Wei, J., Sundaresan, N., & Ma, K.-L. (2012). Visual analysis of massive web session data. IEEE Symposium on Large Data Analysis and Visualization (LDAV), 65–72.
89. Event Sequence Analysis at Twitter
• Data: TBs of session logs every day
• Complexity: millions of sessions per day; 1000+ types of events; long sessions
• Goal: overview of how users are using Twitter
• Technique: LifeFlow
Simplify!
90. Event Sequence Analysis at Twitter (2)
• So far:
– millions of sessions per day → millions of sessions on the same screen
– 1000+ types of events → simplified sets of events (e.g., pages only, selected pages only)
– long sessions → limited session length to 10–20 events
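The simplifications on this slide (keep only selected event types, cap session length) are easy to express; a sketch with a hypothetical page whitelist and length cap (the real pipeline ran as Pig jobs on Hadoop):

```python
def simplify(session, keep_types, max_len=10):
    """Reduce a raw session to a short sequence of interesting events:
    filter to a whitelist of event types, then truncate the length."""
    return [e for e in session if e in keep_types][:max_len]

PAGES = {"home", "profile", "search", "tweet"}  # hypothetical whitelist
raw = ["home", "scroll", "scroll", "search", "click", "tweet", "home"]
clean = simplify(raw, PAGES)
```

After this reduction, millions of sessions collapse into far fewer distinct sequences, which is what makes an on-screen overview feasible.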
92. Event Sequence Analysis at Twitter (4)
• Implementation: Hadoop; web-based (JS)
• More: stored preprocessed data in a smaller DB (MySQL/Vertica)
Pipeline: HDFS → batch Pig scripts → MySQL/Vertica → interactive visualization
93. Takeaway Messages
• Life is full of event sequences.
• How to visualize an event sequence
Krist Wongsuphasawat
krist.wongz@gmail.com
@kristw
94. Event sequence: glyphs on a timeline (linear or non-linear) + interval width + event types (colors, shapes) + high density (facet, aggregation, binning)
95. Takeaway Messages
• Life is full of event sequences.
• How to visualize an event sequence
• How to visualize collection of event seq.
Krist Wongsuphasawat
krist.wongz@gmail.com
@kristw
96. collection: event sequences 1, 2, ..., n. Interactions: align, rank, filter, search, group. Aggregation: by time, by sequence + outcome.
97. Takeaway Messages
• Life is full of event sequences.
• How to visualize an event sequence
• How to visualize collection of event seq.
• Applicable to big data
• New techniques happen everyday.
Krist Wongsuphasawat
krist.wongz@gmail.com
@kristw