Krist Wongsuphasawat's Dissertation Defense: Interactive Exploration of Temporal Event Sequences
8.
1. Lack of overview
Show overview or summary
60,041 patients
203,214 traffic incidents
7,022 web sessions
… and more
Where should I start?
Is the dataset cleaned?
9.
2. Approximate search
QUERY: ICU > Floor > ICU within 2 days
RESULTS: Found 0 records
Frustrated!
Find something useful and display.
10.
Research Questions
Overview: How to provide an overview of multiple event sequences?
LifeFlow
Search: How to support users when they are uncertain about what they are looking for?
Similan
Flexible Temporal Search
11.
Outline
Introduction, Overview (LifeFlow), Approximate Search, Case Studies, Conclusions
How to provide an overview
of multiple event sequences?
12.
From one event sequence...
• Single record [Cousins91], [Harrison94], [Plaisant98], …
Patient ID: 45851737
12/02/2008 14:26 Arrival
12/02/2008 14:26 Emergency
12/02/2008 22:44 ICU
12/05/2008 05:07 Floor
12/08/2008 10:02 Floor
12/14/2008 06:19 Discharge
[Figure: compact timeline for Patient #45851737 along a time axis: Arrival, Emergency Room, ICU, Floor, Discharge]
21.
#1
Event Sequences: n records (#1, #2, #3, …), 1,000,000
Aggregate: O(n)
Tree of Sequences: ∝ No. of patterns, 9 nodes
Represent
LifeFlow Visual Representation: space-filling technique
[Figure: axes are time and records; elements are Event Bar, Average time, End Node]
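The aggregation step above can be illustrated with a minimal Python sketch. It assumes each record is already sorted by time and stored as a list of (time, event) pairs. The names `Node`, `build_tree`, and `count_nodes` are hypothetical and are not taken from LifeFlow itself. Each event is visited once, so the work is linear in the total number of events, and the tree grows with the number of distinct patterns rather than the number of records.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node in a tree of sequences (hypothetical structure)."""
    event: str
    count: int = 0            # number of records passing through this node
    total_gap: float = 0.0    # summed time since the previous event
    children: dict = field(default_factory=dict)

    @property
    def average_gap(self) -> float:
        """Average time from the parent event to this event."""
        return self.total_gap / self.count if self.count else 0.0


def build_tree(records):
    """Merge records that share a prefix of events into one branch.

    Assumes every record is a time-sorted list of (time, event) pairs.
    """
    root = Node("ROOT")
    for record in records:
        root.count += 1
        node, prev_time = root, None
        for time, event in record:
            child = node.children.setdefault(event, Node(event))
            child.count += 1
            if prev_time is not None:
                child.total_gap += time - prev_time
            node, prev_time = child, time
    return root


def count_nodes(node):
    """Count all nodes below `node` (the node itself is not counted)."""
    return sum(1 + count_nodes(child) for child in node.children.values())
```

With three made-up records such as Arrival, ICU, Floor; Arrival, ICU, Discharge; and Arrival, Floor, the shared Arrival and ICU prefixes collapse into single nodes, each carrying a record count and an average time gap.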
23.
User Study
10 participants
12-minute training
15 tasks
Participants could perform the tasks
accurately and rapidly.
24.
Quotes
“Oh! This is very cool!”
“The tool is easy to understand and easy to use.”
“LifeFlow provides a great summary of the big picture.”
“Very easy to find common and uncommon sequences!”
“Can I use it with my dataset?”
26.
Outline
How to support users when they are uncertain
about what they are looking for?
Introduction, Overview (LifeFlow), Approximate Search (Similarity Search, Hybrid Search), Case Studies, Conclusions
27.
Related Work: Exact Match
Exact Match: MUST have A, B, C
[Figure: Query compared against Record#1, Record#2, Record#3]
• Event Sequence
– TimeSearcher [Hochheiser04]
– PatternFinder [Fails06]
– LifeLines2 [Wang08]
– ActiviTree [Vrotsou09]
– QueryMarvel [Jin09]
28.
Related Work: Similarity Search
SHOULD have A, B, C
• Image Similarity Search [Kato92]
• Stock Price [Wattenberg01]
• Web page [Watai07]
• Bank account [Chang07]
• Event Sequence?
[Figure: Query with records ranked from more similar: Record#2 0.91, Record#1 0.83, Record#3 0.70]
29.
Challenges
What is similar? Depends on users/tasks.
Query (Record #1): A B C
Record #2 (missing): A B C
Record #3 (extra): A B C D
Record #4 (time difference): A B … C
Record #5 (swap): A C B
30.
Match & Mismatch (M&M) Measure
[Figure, along a time axis; legend: Matched events, Missing, Extra]
Query (Record #1): A C B D
Record #2: A B C E
Four measures combine into a Total Score (0.00-1.00):
• Time difference
• Number of swaps
• Number of missing events
• Number of extra events
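To make the four measures above concrete, here is a simplified Python sketch. It is not the dissertation's actual M&M formula. It assumes each event type occurs at most once per record, that events are matched by type, that each measure is normalized to a 0-1 penalty, and that the penalties are weighted equally. The function names, the `max_time_diff` parameter, the weights, and the event times in the usage note are all hypothetical.

```python
from itertools import combinations


def mm_components(query, record):
    """Compare two records given as {event_type: time} dicts.

    Assumes each event type occurs at most once per record.
    """
    matched = sorted(set(query) & set(record))
    missing = sorted(set(query) - set(record))   # in query, not in record
    extra = sorted(set(record) - set(query))     # in record, not in query
    time_diff = sum(abs(query[e] - record[e]) for e in matched)
    # A swap is a pair of matched events whose order differs between the two.
    swaps = sum(
        1
        for a, b in combinations(matched, 2)
        if (query[a] - query[b]) * (record[a] - record[b]) < 0
    )
    return {
        "matched": matched,
        "missing": missing,
        "extra": extra,
        "time_diff": time_diff,
        "swaps": swaps,
    }


def mm_score(query, record, max_time_diff, weights=(0.25, 0.25, 0.25, 0.25)):
    """Combine the four penalties into one score between 0.0 and 1.0."""
    c = mm_components(query, record)
    n_matched = len(c["matched"])
    n_pairs = n_matched * (n_matched - 1) // 2
    penalties = (
        min(c["time_diff"] / max_time_diff, 1.0) if max_time_diff else 0.0,
        c["swaps"] / n_pairs if n_pairs else 0.0,
        len(c["missing"]) / len(query) if query else 0.0,
        len(c["extra"]) / len(record) if record else 0.0,
    )
    return 1.0 - sum(w * p for w, p in zip(weights, penalties))
```

For the slide's example, with made-up times `{"A": 0, "C": 1, "B": 2, "D": 3}` for the query and `{"A": 0, "B": 1, "C": 2, "E": 3}` for Record #2, the sketch finds matched events A, B, C, one missing event (D), one extra event (E), and one swap (B and C). Changing the weights is one way to reflect that similarity depends on the user and the task.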
31.
#2
Similarity Search
Similarity Measure (Match & Mismatch): What is similar?
+
User Interface (Similan): Specify query / Display results
Version 1
Version 2
42.
Similarity Vector s(i,j)
• No. of matched events (mandatory)
• No. of matched events (optional)
• No. of negations violated (optional)
• No. of negations violated (mandatory)
• No. of time constraints violated
• Time difference
• No. of extra events
– Extra before the first match
– Extra between the first and last match
– Extra after the last match
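The three kinds of extra events listed above can be counted with a short Python sketch. It assumes the matching step has already produced the positions of the matched events within a record. The function name `count_extra_events` and its inputs are hypothetical and are not part of the Flexible Temporal Search implementation.

```python
def count_extra_events(events, matched_positions):
    """Split unmatched events into before / between / after groups.

    `events` is the record as a time-ordered list, and
    `matched_positions` holds the indices of events that matched the query.
    Returns a (before, between, after) tuple of counts.
    """
    matched = set(matched_positions)
    if not matched:
        raise ValueError("at least one matched event is required")
    first, last = min(matched), max(matched)
    before = first                                # events ahead of the first match
    after = len(events) - 1 - last                # events following the last match
    between = (last - first + 1) - len(matched)   # unmatched events inside the span
    return before, between, after
```

Keeping the three counts separate lets a scoring function penalize extra events inside the matched span differently from those that fall outside it.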
47.
MILCs
#  Domain          Data Size  Duration
1  Medical             7,041  7 months
2  Transportation    203,214  3 months
3  Medical            20,000  6 months
4  Medical            60,041  1 year
5  Web logs            7,022  6 weeks
6  Activity logs          60  5 months
7  Logistics             821  6 weeks
8  Sports                 61  5 weeks
8 case studies / 6 domains
48.
Case #1: Medical
User: Dr. A. Zach Hettinger
MedStar Institute for Innovation
mi2.org
Data: 60,041 patients
Task: Hospital readmissions
49.
Current Report
Patient  Diagnosis   Visit Date #1  Physician #1  Visit Date #2  Physician #2
Mr. X    Back pain   Jun 10, 2010   Dr. Jones     Jun 29, 2010   Dr. Brown
Mr. Y    Chest pain  Jun 11, 2010   Dr. Jones     Jun 20, 2010   Dr. Jones
…        …           …              …             …              …
An example of current report used in a hospital (fake data)
How many patients came back?
Did they come back for the 3rd, 4th, … time?
How many came back and died?
…
50.
60,041 patients
How many patients came back?
Did they come back for the 3rd, 4th, … time?
Registration
51.
60,041 patients
How many came back and died?
Registration
Death
52.
60,041 patients
Location
Registration
Admission
Death
53.
60,041 patients
Find a pattern:
Registration > Discharge > Registration > Death
Registration
Discharge
Death
54.
60,041 patients
Find a pattern:
Registration > Discharge > Registration > Death
Registration
Discharge
Death
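A pattern such as Registration > Discharge > Registration > Death can be checked against each record with a small Python sketch. It assumes that ">" means "followed, possibly later, by", so other events may occur in between. The function names and the patient IDs in the usage note are hypothetical.

```python
def contains_pattern(sequence, pattern):
    """Return True if `pattern` occurs in `sequence` in order.

    Other events may appear between the pattern's events (assumption).
    """
    remaining = iter(sequence)
    # `event in remaining` advances the iterator past the first match,
    # so each pattern event must be found after the previous one.
    return all(event in remaining for event in pattern)


def find_records(records, pattern):
    """Return the IDs of records whose event sequence contains `pattern`."""
    return [
        record_id
        for record_id, sequence in records.items()
        if contains_pattern(sequence, pattern)
    ]
```

For example, a made-up record `["Registration", "Admission", "Discharge", "Registration", "Death"]` matches the pattern, while `["Registration", "Death"]` does not, because no Discharge and second Registration come before the Death.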
55.
Analyzing data in a new way
Personal exploration
Long-term monitoring
Save more lives!
56.
Case #2: Transportation
User: CATT Lab at the University of Maryland
www.cattlab.umd.edu
Data: 203,214 traffic incidents
Task: Comparing traffic agencies’ performance
62.
Case #3: Web logs
User: Anne Rose
International Children’s Digital Library
www.childrenslibrary.org
Data: 7,022 sessions
Task: How do people read children's books online?
PAGE 1 PAGE 2 PAGE 3 …
65.
Understand data
Surprising pattern
New hypotheses
66.
Case #4: Sports
User: Daniel Lertpratchya
Manchester United soccer fan
www.manutd.com
Data: 61 soccer matches
Task: Find interesting matches to watch replay videos.
Explore data to find fun facts.
Begin Score Opponent Score End
67.
Find interesting matches
Begin
Score
Opponent Score
End
70.
Performance: home vs. away
Begin
Score
Opponent Score
Missed Penalty
End
71.
Finding specific situations.
Begin
Score
Opponent Score
Missed Penalty
End
72.
#4
Design Guidelines
• Align-Rank-Filter
• Handle event types (e.g., Breakfast and Lunch grouped as Meal)
• Incorporate attributes
• Multiple levels of information (Overview, Record, Event)
• Multiple overviews
• Coordinated views
• Search
• Data preprocessing
• History / Provenance
73.
Outline
Introduction, Overview (LifeFlow), Approximate Search, Case Studies, Conclusions
74.
Contributions
1. How to provide an overview of multiple event sequences?
#1 LifeFlow Visualization: Aggregation, Visual encodings & Interactions
2. How to support users when they are uncertain about what they are looking for?
#2 Similarity Search: Similan + Match & Mismatch
#3 Hybrid Search: Flexible Temporal Search
#4 Case Studies + Design Guidelines
75.
Future Directions
Outflow
• Improve the visualization & UI: colors, gaps, …
• New tasks: comparison, attributes in query, …
• More complex data: stream, interval, concurrency, …
• Scalability: database, cloud computing, …
76.
Outline
Introduction, Overview (LifeFlow), Approximate Search, Case Studies, Conclusions
77.
Outline
Introduction, Overview (LifeFlow), Approximate Search, Case Studies, Conclusions
This is an event sequence!
80.
Acknowledgement
Washington Hospital Center
Dr. A. Zach Hettinger, Dr. Phuong Ho and Dr. Mark Smith
National Institutes of Health
Grant RC1CA147489-02
Center for Integrated Transportation Systems Management
a Tier 1 Transportation Center at the University of Maryland
Study Participants
Advisors, Committees, HCIL Colleagues
81.
Contributions
1. How to provide an overview of multiple event sequences?
LifeFlow Visualization: Aggregation, Visual encodings & Interactions
2. How to support users when they are uncertain about what they are looking for?
Similarity Search: Similan + Match & Mismatch
Hybrid Search: Flexible Temporal Search
Case Studies + Design Guidelines
http://www.cs.umd.edu/hcil/lifeflow kristw@cs.umd.edu / @kristwongz