10. Krist Wongsuphasawat / @kristw
Computer Engineer
Bangkok, Thailand
PhD in Computer Science
Univ. of Maryland
Information Visualization
IBM
Microsoft
Data Visualization Scientist
Twitter
15. Using visualizations to monitor changes and harvest insights from log data at Twitter
Krist Wongsuphasawat (@kristw) & Jimmy Lin (@lintool)
IEEE VAST 2014
39. Log data in Hadoop
Client event collection
Engineers & Data Scientists
Aggregate
10,000+ event types

date      client  page  section  component  element  action      count
20141011  web     home  home     -          -        impression  100
20141011  web     home  wtf      -          -        click       20

wtf = Who-to-Follow
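The aggregate table above can be illustrated with a minimal single-machine sketch. It assumes raw log records are tuples of (date, client, page, section, component, element, action); the sample records and the name `aggregate_daily_counts` are hypothetical, and this is not the Hadoop job used at Twitter.

```python
from collections import Counter

# Hypothetical raw log records:
# (date, client, page, section, component, element, action)
raw_events = [
    ("20141011", "web", "home", "home", "-", "-", "impression"),
    ("20141011", "web", "home", "home", "-", "-", "impression"),
    ("20141011", "web", "home", "wtf", "-", "-", "click"),
]


def aggregate_daily_counts(events):
    """Count how often each (date, six-level event name) occurs."""
    return Counter(events)


daily_counts = aggregate_daily_counts(raw_events)

# Print one row per (date, event name), like the aggregate table.
for (date, *name), count in sorted(daily_counts.items()):
    print(date, " ".join(name), count)
```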
45. client page section component element action
Client event collection
Log data in Hadoop
Aggregate
Engineers & Data Scientists
Find
Search
Query: web home * * impression*
Return: Results
web : home : home : - : - : impression
web : home : wtf : - : - : impression
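A wildcard search over event names, as in the query above, might look like the following sketch. The slides do not specify the query syntax, so this assumes colon-separated event names and shell-style patterns matched with Python's `fnmatch`; the event list and `search_events` are hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical catalog of six-level event names.
event_names = [
    "web:home:home:-:-:impression",
    "web:home:wtf:-:-:impression",
    "web:home:wtf:-:-:click",
    "web:profile:-:-:-:impression",
]


def search_events(names, pattern):
    """Return the names matching a shell-style wildcard pattern.

    Note: in fnmatch, '*' matches any run of characters, including ':',
    so one '*' may span several levels of the hierarchy.
    """
    return [name for name in names if fnmatchcase(name, pattern)]


matches = search_events(event_names, "web:home:*:*:*:impression*")
for name in matches:
    print(name)
```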
48. client page section component element action
Client event collection
Log data in Hadoop
Aggregate
Engineers & Data Scientists
Find
Search
Query
Return: Results
web : home : home : - : - : impression
web : home : wtf : - : - : impression
Search can be better
10,000+ event types; not everybody knows them
e.g. What are all sections under web:home?
50. client page section component element action
Client event collection
Log data in Hadoop
Aggregate
Engineers & Data Scientists
Find
Search
Query
Return: Results
web : home : home : - : - : impression
One graph per event x 10,000
Search can be better
10,000+ event types; not everybody knows them
e.g. What are all sections under web:home?
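A question like "What are all sections under web:home?" is easy to answer once the flat event names are arranged as a hierarchy. The sketch below assumes colon-separated names and builds a nested dictionary; `build_hierarchy`, `children`, and the sample names are hypothetical, not the tool's actual data structure.

```python
def build_hierarchy(names, sep=":"):
    """Build a nested dict with one level per part of the event name."""
    root = {}
    for name in names:
        node = root
        for level in name.split(sep):
            node = node.setdefault(level, {})
    return root


def children(tree, *path):
    """List the distinct values one level below the given path."""
    node = tree
    for key in path:
        node = node[key]
    return sorted(node)


# Hypothetical catalog of six-level event names.
event_names = [
    "web:home:home:-:-:impression",
    "web:home:wtf:-:-:impression",
    "web:home:wtf:-:-:click",
    "web:profile:-:-:-:impression",
]

tree = build_hierarchy(event_names)
sections = children(tree, "web", "home")
print(sections)
```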
58. See
How to visualize?
narrow down
Client event collection
Engineers & Data Scientists
client : page : section : component : element : action
Interactions
search box => filter
74. Funnel analysis
banana : home : - : - : - : impression (home page)
banana : profile : - : - : - : impression (profile page)
banana : search : - : - : - : impression (search page)
Specify all funnels manually!
n jobs
n hours
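To illustrate why specifying funnels manually is costly, here is a minimal sketch in which each funnel is a hand-written list of steps and needs its own pass over the sessions. The sessions, the in-order matching rule, and `funnel_counts` are assumptions for illustration only.

```python
HOME = "banana:home:-:-:-:impression"
PROFILE = "banana:profile:-:-:-:impression"
SEARCH = "banana:search:-:-:-:impression"

# Hypothetical sessions: each is an ordered list of event names.
sessions = [
    [HOME, PROFILE],
    [HOME, SEARCH],
    [HOME],
    [PROFILE, HOME, SEARCH],
]


def funnel_counts(sessions, steps):
    """Count, per step, the sessions that reached it in the given order."""
    counts = [0] * len(steps)
    for session in sessions:
        position = 0  # index of the next funnel step to look for
        for event in session:
            if position < len(steps) and event == steps[position]:
                counts[position] += 1
                position += 1
    return counts


# Every funnel has to be listed by hand and evaluated separately.
home_to_profile = funnel_counts(sessions, [HOME, PROFILE])
home_to_search = funnel_counts(sessions, [HOME, SEARCH])
print(home_to_profile, home_to_search)
```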
75. Goal
banana : home : - : - : - : impression
… ……
1 job => all funnels, visualized
home page
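The goal of computing all funnels in one job can be sketched as counting every sequence prefix in a single pass, so that any funnel starting at the first event can be read from the same result. This is a single-machine illustration with hypothetical sessions and short page names, not the Hadoop job described in the talk.

```python
from collections import Counter


def count_prefixes(sessions, max_length=3):
    """Count every prefix (up to max_length events) of every session."""
    counts = Counter()
    for session in sessions:
        for end in range(1, min(len(session), max_length) + 1):
            counts[tuple(session[:end])] += 1
    return counts


# Hypothetical sessions, using page names as shorthand for event names.
sessions = [
    ["home", "profile"],
    ["home", "search"],
    ["home"],
    ["home", "search", "profile"],
]

prefix_counts = count_prefixes(sessions)

# One pass yields the counts for every funnel that starts at "home".
for prefix, count in sorted(prefix_counts.items()):
    print(" > ".join(prefix), count)
```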
77. Related work
• Visualize an overview of event sequences
[Wongsuphasawat et al. 2011, Monroe et al. 2013, …]
• Big data? eBay checkout sequences
[Shen et al. 2013]
One funnel at a time
Checkout > Payment > Confirm > Success
110. Final process
1. Define set of events
2. Pick alignment, direction and window size
3. Run Hadoop job (with more aggregation)
4. Wait for it… (2+ hrs)
5. Visualize
gazillion patterns (TBs)
~100,000 patterns (10MB)
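Steps 1 to 3 of the final process can be sketched on a single machine as follows. The slides do not define alignment, direction, or window size precisely, so this assumes alignment means the first occurrence of a chosen event, direction means looking before or after it, and window size means the number of neighbouring events kept. All names and sessions are hypothetical.

```python
from collections import Counter


def extract_pattern(session, align_event, direction="after", window=2):
    """Cut a window of events around the first occurrence of align_event."""
    if align_event not in session:
        return None
    i = session.index(align_event)
    if direction == "after":
        return tuple(session[i : i + window + 1])
    start = max(0, i - window)
    return tuple(session[start : i + 1])


def aggregate_patterns(sessions, align_event, direction="after", window=2):
    """Collapse many sessions into counts of distinct windowed patterns."""
    patterns = (
        extract_pattern(s, align_event, direction, window) for s in sessions
    )
    return Counter(p for p in patterns if p is not None)


# Hypothetical sessions, using page names as shorthand for event names.
sessions = [
    ["signup", "home", "search", "profile"],
    ["home", "search", "profile", "home"],
    ["home", "profile"],
    ["search"],
]

after_home = aggregate_patterns(sessions, "home", direction="after", window=2)
before_home = aggregate_patterns(sessions, "home", direction="before", window=1)
print(after_home)
print(before_home)
```

Limiting the window and merging identical patterns is one way the number of distinct patterns could shrink before visualization.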
112. Deployment
• Since Jan 2013
• Fewer users, but more in-depth ad-hoc analysis
• Initial meeting to provide support
113. Case studies
• What did users do when they visited Twitter? (in demo)
• Where did users give up in the sign-up process?
• More in the paper
118. Acknowledgements
• Data Scientists & Engineers @Twitter: Linus Lee, Chuang Liu
• Feedback from reviewers, Ben Shneiderman & Catherine Plaisant
119. Conclusions & Future work
• Large-scale User Activity Logs + Visual Analytics
• Find, Monitor & Explore
+ Anomaly detection & automatic alert
• Funnel Analysis
+ More interactivity & data / reduce wait time
• Used in day-to-day operations at Twitter
• Generalize to smaller systems
Challenge
big data => aggregate & sacrifice => small data => visualize & interact
kristw@twitter.com / @kristw