Reference: Himel Dev, and Zhicheng Liu, "Identifying Frequent User Tasks from Application Logs", 22nd ACM International Conference on Intelligent User Interfaces (IUI), Limassol, Cyprus, March, 2017.
3. Objective
Identifying frequent tasks performed by many users
• A task is a set of operations performed together by a user to achieve
a particular goal or milestone.
• E.g., in Photoshop, editing text in an element can be considered as a
task.
4. Applications
Software Engineering
• Requirement Analysis at Scale
• Bug Identification at Scale
User Modeling
• Meaningful User Clustering
• User Expertise Modeling
Log Visualization
• Coarse Visualization Unit
• Noise Elimination
5. Example Logs
Log 1
Open
Image Size
Crop
Save for Web
…
Log 2
Open
New Type Layer
Free Transform
Edit Type Layer
…
Log 3
Open
Canvas Size
Fill Layer
Link Mask
…
6. Challenges
Volume and Complexity of Data
High Cardinality Event-Set
$E = \{e_1, e_2, e_3, \ldots, e_m\}$
Arbitrarily Long Sessions
$S_j = [e_{j1}, e_{j2}, e_{j3}, \ldots, e_{jn}]$
Diversity and Error in User Behavior
Task Equivalence
$[e_1, e_3, e_4] \sim [e_1, e_4, e_3]$
Unintended Operations
$S_j = [e_{j1}, e_{j2}, e_{j3}, \ldots, e_{jn}]$
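One way to read the task-equivalence relation above is that two operation sequences count as the same task when they contain the same operations, regardless of order. The sketch below illustrates that reading only; the helper name `task_key` is hypothetical and not taken from the talk.

```python
def task_key(operations):
    """Order-insensitive key: two sequences with the same operations map to the same key.

    Assumption: a task is treated as an itemset, so order and repetition are ignored.
    """
    return frozenset(operations)


first = ["e1", "e3", "e4"]
second = ["e1", "e4", "e3"]
same_task = task_key(first) == task_key(second)
print(same_task)
```

Using a set also ignores repeated operations, which matches the assumption that a user may execute a required operation multiple times within a task.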
10. A user may execute a required operation multiple times within
the duration of a task.
Assumption A2
11. A user may perform multiple tasks in a single session.
Assumption A3
12. To perform a task, a user executes the corresponding
operations contiguously, with no or few outliers.
Assumption A4
13. State-of-the-Art
Method A1 A2 A3 A4
Frequent Itemset √ √ × ×
Sequential Pattern × √ × ×
Cohesive Itemset √ × √ √
Frequent Episode × × × ×
Existing cohesion-sensitive patterns measure cohesion based
on the length of occurrence window(s).
Order Sensitive
Cohesion Sensitive
14. Minimum Length Occurrence Window
$w_{P,S}^{(L-)}$: The minimum-length interval(s) within sequence $S$ that
contain pattern $P$
$w_{P,S}^{(L-)} = \arg\min_{w_{P,S}} L(w_{P,S})$
$S_j$: A X Y B Z C D E A A A B B B B B B C C C
$P_i = \{A, B, C\}$
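The definition above can be checked on the example sequence with a brute-force scan. This is a minimal sketch, not the authors' implementation; the function name `min_length_windows` and the 0-based inclusive `(start, end)` convention are assumptions made for illustration.

```python
def min_length_windows(seq, pattern):
    """Return (length, spans) for the shortest intervals of seq containing every item of pattern.

    Spans are 0-based inclusive (start, end) index pairs. Brute force, O(n^2).
    """
    pattern = set(pattern)
    best, spans = None, []
    for i in range(len(seq)):
        seen = set()
        for j in range(i, len(seq)):
            if seq[j] in pattern:
                seen.add(seq[j])
            if seen == pattern:
                length = j - i + 1
                if best is None or length < best:
                    best, spans = length, [(i, j)]
                elif length == best:
                    spans.append((i, j))
                break  # extending further from this start only makes the window longer
    return best, spans


S = "A X Y B Z C D E A A A B B B B B B C C C".split()
length, spans = min_length_windows(S, {"A", "B", "C"})
print(length, spans)
```

On the example sequence the shortest windows have length 6 and each contains three events outside the pattern, which is the weakness of a purely length-based cohesion measure.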
15. Outlier Based Minimum Occurrence Window
$w_{P,S}^{(O-)}$: The interval(s) within $S$ that contain $P$, while
containing the minimum possible number of outliers
$w_{P,S}^{(O-)} = \arg\min_{w_{P,S}} O(w_{P,S})$
Sj : A X Y B Z C D E A A A B B B B B B C C C
Length = 6, # of Outliers = 3
Length = 8, # of Outliers = 0
Pi = {A, B, C}
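The outlier-based window can be sketched the same way by counting events outside the pattern instead of measuring length. The code below is an illustration under two assumptions that are not stated on the slide: an outlier is any event in the window that is not in $P$, and ties on outlier count are broken by shortest length (which reproduces the length-8 window in the example). The name `outlier_min_windows` is hypothetical.

```python
def outlier_min_windows(seq, pattern):
    """Return (outliers, length, spans) for the intervals containing pattern with fewest outliers.

    Assumption: among intervals with the minimum outlier count, the shortest ones are reported.
    Spans are 0-based inclusive (start, end) index pairs. Brute force, O(n^2).
    """
    pattern = set(pattern)
    candidates = []
    for i in range(len(seq)):
        seen, outliers = set(), 0
        for j in range(i, len(seq)):
            if seq[j] in pattern:
                seen.add(seq[j])
            else:
                outliers += 1
            if seen == pattern:
                candidates.append((outliers, j - i + 1, (i, j)))
    if not candidates:
        return None
    min_out = min(o for o, _, _ in candidates)
    min_len = min(l for o, l, _ in candidates if o == min_out)
    spans = [s for o, l, s in candidates if o == min_out and l == min_len]
    return min_out, min_len, spans


S = "A X Y B Z C D E A A A B B B B B B C C C".split()
result = outlier_min_windows(S, {"A", "B", "C"})
print(result)
```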
16. Min Outlier Based Max Occurrence Window
$w_{P,S}^{(O-)(L+)}$: The maximum-length interval(s) within $S$ that contain
$P$, while containing the minimum possible number of outliers
$w_{P,S}^{(O-)(L+)} = \arg\max_{w_{P,S}^{(O-)}} L(w_{P,S}^{(O-)})$
Sj : A X Y B Z C D E A A A B B B B B B C C C
Length = 6, # of Outliers = 3
Length = 12, # of Outliers = 0
Pi = {A, B, C}
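For the maximum-length variant, the same brute-force scan applies, except that among the intervals with the fewest outliers the longest is kept. This is a sketch under the assumption that an outlier is any event not in $P$; the function name `min_outlier_max_windows` is hypothetical.

```python
def min_outlier_max_windows(seq, pattern):
    """Return (outliers, length, spans) for the longest intervals among those with fewest outliers.

    Spans are 0-based inclusive (start, end) index pairs. Brute force, O(n^2).
    """
    pattern = set(pattern)
    candidates = []
    for i in range(len(seq)):
        seen, outliers = set(), 0
        for j in range(i, len(seq)):
            if seq[j] in pattern:
                seen.add(seq[j])
            else:
                outliers += 1
            if seen == pattern:
                candidates.append((outliers, j - i + 1, (i, j)))
    if not candidates:
        return None
    min_out = min(o for o, _, _ in candidates)
    max_len = max(l for o, l, _ in candidates if o == min_out)
    spans = [s for o, l, s in candidates if o == min_out and l == max_len]
    return min_out, max_len, spans


S = "A X Y B Z C D E A A A B B B B B B C C C".split()
result = min_outlier_max_windows(S, {"A", "B", "C"})
print(result)
```

On the example sequence this selects the full outlier-free run of A, B, and C events (length 12), which is why the slide associates this window with task boundaries.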
17. Outlier Based Cohesion Metrics
Takeaway 1
• Outlier based minimum occurrence window captures
true cohesion for tasks.
Takeaway 2
• Minimum outlier based maximum occurrence window
captures task boundaries.
18. Evaluation
Itemset Ranking
• Mining and ranking itemsets based on cohesion metrics
User Study
• Sampling itemsets and rating by expert users
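The slides do not specify how the cohesion metric is turned into a ranking score. As a purely illustrative sketch, the code below assumes one simple choice: score each itemset by the average of its minimum outlier count over the sessions that contain it, and rank lower scores first. The names `min_outliers` and `rank_itemsets` and the second session are hypothetical.

```python
def min_outliers(seq, pattern):
    """Fewest non-pattern events in any interval of seq containing all of pattern, or None."""
    pattern = set(pattern)
    best = None
    for i in range(len(seq)):
        seen, outliers = set(), 0
        for j in range(i, len(seq)):
            if seq[j] in pattern:
                seen.add(seq[j])
            else:
                outliers += 1
            if seen == pattern:
                if best is None or outliers < best:
                    best = outliers
                break  # a longer window from this start cannot have fewer outliers
    return best


def rank_itemsets(sessions, itemsets):
    """Rank itemsets by average minimum outlier count (assumed score; lower is more cohesive)."""
    scored = []
    for items in itemsets:
        counts = [c for c in (min_outliers(s, items) for s in sessions) if c is not None]
        if counts:  # skip itemsets that never occur completely in any session
            scored.append((sum(counts) / len(counts), sorted(items)))
    scored.sort()
    return [items for _, items in scored]


sessions = [
    "A X Y B Z C D E A A A B B B B B B C C C".split(),
    "A B C X D".split(),  # hypothetical second session
]
ranking = rank_itemsets(sessions, [{"A", "D"}, {"A", "B", "C"}])
print(ranking)
```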
19. Top 16 vs Remaining
[Figure: two panels, "Proposed Ranking" and "State of the Art Ranking"]
21. Work in Progress
• A probabilistic model to capture the reported assumptions, along
with the underlying dynamics of operations
• A visual analytics tool to explore user tasks in an interactive manner
22. Conclusion
• We formulate the frequent task identification problem using a set of
example driven assumptions.
• We propose a novel outlier based cohesion metric to capture true
cohesion of operations within a potential task.
• We conduct a user study using Photoshop logs to determine the
effectiveness of our approach.