This document proposes using reinforcement learning to automate keystroke-level modeling (RL-KLM). Keystroke-level modeling is traditionally used to predict task completion times by modeling user behavior as a sequence of independent operators such as pointing and pressing. The authors represent the user interface as a Markov decision process that can be solved with reinforcement learning to find an optimal operator sequence. They demonstrate RL-KLM on cases such as controlling a remote, selecting modalities on an alarm, and filling out a form. The approach could enable automated user interface evaluation and optimization by finding designs that are simple, fast, and consistent.
3. Related Work
Model-based evaluation
• GOMS and KLM models used as evaluation functions
• E.g. CogTool, STEM
• Demonstrations required
[https://cogtool.wordpress.com]
[http://stem.lille.inria.fr]
Reinforcement learning in cognitive models
• Model learns a policy to use a UI
• Case specific (e.g. text entry)
[Jokinen et al. 2017] [Chen et al. 2015]
Inverse Reinforcement Learning
• Learns reward functions from observation
• Requires data
[Brochu et al. 2010]
5. Keystroke Level Model
Predicts task completion time by modelling behaviour as a sequence of independent operators.
• Mental operator: 1.35 s
• Pointing: 1.1 s
• Pressing: 1.5 s
• System response: 1.5 s
[Card et al. 1980]
Traditionally, the operator sequence is handcrafted.
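The prediction on this slide can be sketched in a few lines: total task time is simply the sum of the durations of the operators in a handcrafted sequence. The dictionary keys and the example sequence below are illustrative; the durations are the ones listed on the slide.

```python
# KLM sketch: task time is the sum of independent operator durations.
# Durations are taken from the slide; the one-letter keys are illustrative.
KLM_DURATIONS = {
    "M": 1.35,  # mental operator
    "P": 1.1,   # pointing
    "K": 1.5,   # pressing
    "R": 1.5,   # system response
}

def klm_time(sequence):
    """Predicted completion time for a handcrafted operator sequence."""
    return sum(KLM_DURATIONS[op] for op in sequence)

# Think, point at a button, press it, wait for the system to respond:
print(round(klm_time(["M", "P", "K", "R"]), 2))  # 5.45
```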
6. KLM as MDP
A Markov decision process (MDP) provides a mathematical framework for decision making.
The MDP's policy, which corresponds to the KLM operator sequence, can be solved with reinforcement learning.
The UI can be represented by a state-action simulator.
[Diagram: agent and interface exchanging state, action, and reward]
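The "state-action simulator" idea can be made concrete with a small environment class exposing a gym-style step() interface. The class name, its fields, and the toy dynamics below are assumptions for illustration, not the authors' code.

```python
# Sketch: the UI wrapped as a state-action simulator. The agent sends an
# action (a KLM operator with a duration); the simulator returns the next
# UI state, a reward, and a done flag.
class UISimulator:
    def __init__(self, n_fields=3):
        self.n_fields = n_fields  # e.g. form fields left to fill
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, operator_duration):
        """Advance the UI by one KLM operator; reward is -duration."""
        self.state += 1
        done = self.state >= self.n_fields
        return self.state, -operator_duration, done

sim = UISimulator(n_fields=2)
sim.reset()
print(sim.step(1.1))  # (1, -1.1, False)
```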
7. Reinforcement Learning
Learner
• Learns a policy by trial and error, interacting with the environment to maximise the cumulative reward.
• The policy defines which action the agent performs in the current state.
Reward (maze example)
• Finish the maze
• Penalty for each used action
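The maze example above can be sketched with minimal tabular Q-learning. The maze here is a 1-D corridor of five cells, the per-action penalty is -1 and the goal reward +10; all sizes, rewards, and hyperparameters are illustrative assumptions.

```python
import random

# Tabular Q-learning on a tiny corridor maze: start at cell 0, reach cell 4.
# Each action costs -1 (penalty per used action); the goal pays +10.
N, GOAL = 5, 4
ACTIONS = [-1, +1]  # step left / step right
Q = {(s, a): 0.0 for s in range(N) for a in ACTIONS}
alpha, gamma, eps = 0.5, 0.9, 0.1

random.seed(0)
for _ in range(500):  # training episodes
    s = 0
    while s != GOAL:
        # Epsilon-greedy action selection.
        if random.random() < eps:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: Q[(s, act)])
        s2 = min(max(s + a, 0), N - 1)
        r = 10.0 if s2 == GOAL else -1.0
        best_next = 0.0 if s2 == GOAL else max(Q[(s2, b)] for b in ACTIONS)
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
        s = s2

# Greedy policy after training: always step right toward the goal.
policy = [max(ACTIONS, key=lambda act: Q[(s, act)]) for s in range(GOAL)]
print(policy)  # [1, 1, 1, 1]
```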
8. RL-KLM
Finds a KLM operator sequence which
minimises task completion time.
Q-learning, where the action is a KLM operator, the reward is the (negative) duration of the operator, and the state is the state of the UI.
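The mapping on this slide can be written out directly: actions are KLM operators, the reward is the negative operator duration (so maximising cumulative reward minimises total task time), and states are UI states. A minimal sketch of one Q-learning update under these assumptions; the function names are illustrative.

```python
# Operator durations as listed earlier in the deck.
KLM_DURATIONS = {"M": 1.35, "P": 1.1, "K": 1.5, "R": 1.5}

def reward(operator):
    # Penalise the agent by the operator's duration, so maximising
    # cumulative reward minimises total task completion time.
    return -KLM_DURATIONS[operator]

def q_update(Q, s, op, s2, next_ops, alpha=0.1, gamma=1.0):
    """One Q-learning step; pass next_ops=[] when s2 is terminal."""
    best_next = max((Q.get((s2, o), 0.0) for o in next_ops), default=0.0)
    q = Q.get((s, op), 0.0)
    Q[(s, op)] = q + alpha * (reward(op) + gamma * best_next - q)

Q = {}
q_update(Q, "start", "P", "done", [])  # point once, then the task ends
print(round(Q[("start", "P")], 3))  # -0.11
```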
11. Case 1: Remote Controller
Task:
• Switch to a channel
• Select from two button types (blue
or green)
Proof of concept
• Time-optimal policy
• Selected button depends on the
distance between channels
• If distance > 3: blue
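The learned policy reported for this case is simple enough to write down directly; the threshold comes from the slide, while the function name and return values are illustrative.

```python
def pick_button(distance):
    # Time-optimal choice reported on the slide: the blue button type
    # pays off only when the channel distance is large enough.
    return "blue" if distance > 3 else "green"

print([pick_button(d) for d in range(1, 6)])
# ['green', 'green', 'green', 'blue', 'blue']
```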
12. Case 2: Multimodal Alarm
Problem
• Select modality to go to the goal state.
• Some modalities are inaccurate.
The policy accounts for recognition errors.
Gestures are the fastest to use, speech the second fastest, and tactile the slowest.
13. Case 3: GUI - Form filling
Task:
• Visit all states
Suited for spatial tasks: finds
the fastest path to visit the
items.
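The "fastest path to visit the items" can be sketched as a shortest-tour search over the form fields. The field coordinates and the distance-as-time assumption below are illustrative, and brute force stands in for the learned policy on this tiny instance.

```python
from itertools import permutations

# Illustrative field positions on a 2-D form; movement time is assumed
# proportional to Euclidean distance.
FIELDS = {"name": (0, 0), "email": (0, 1), "age": (2, 0), "submit": (2, 1)}

def tour_cost(order, start=(0, 0)):
    """Total travel distance when visiting the fields in the given order."""
    pos, total = start, 0.0
    for field in order:
        x, y = FIELDS[field]
        total += ((x - pos[0]) ** 2 + (y - pos[1]) ** 2) ** 0.5
        pos = (x, y)
    return total

best = min(permutations(FIELDS), key=tour_cost)
print(best, round(tour_cost(best), 2))
```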
16. Optimization
Objectives:
Simple, Fast, Consistent
• Trade-off between simple and fast
• Consistency: logical structure
UI modeled with a finite state machine
Design space and the tasks are
automatically generated.
[Figures: the simplest, balanced, and fastest generated designs]
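The simple-versus-fast trade-off on this slide can be illustrated with a scalarised objective. The weights, the scoring function, and the three candidate designs below are invented for illustration, not the authors' objective or data; consistency is omitted for brevity.

```python
# Hypothetical candidates: (predicted task time in s, number of FSM states).
DESIGNS = {"simplest": (9.0, 3), "balanced": (6.5, 5), "fastest": (5.0, 9)}

def score(task_time, n_states, w_fast=1.0, w_simple=0.5):
    # Lower is better: a weighted sum of the speed and simplicity
    # objectives trades one off against the other.
    return w_fast * task_time + w_simple * n_states

best = min(DESIGNS, key=lambda d: score(*DESIGNS[d]))
print(best)  # balanced
```

With these (invented) weights the balanced design wins; shifting weight toward w_fast or w_simple recovers the fastest or simplest design instead.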
17. Conclusion
• KLM is a general model that can be automated with
Reinforcement Learning
• RL-KLM: Finds a policy that minimizes the task completion time.
• Initial results for simple cases.
• Possible applications: Evaluation, Optimization
Demo and code for all experiments:
https://github.com/aalto-speech/rl-klm
19. References
- Jokinen, Jussi PP, et al. "Modelling learning of new keyboard layouts."
Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems.
ACM, 2017.
- Chen, Xiuli, et al. "The emergence of interactive behavior: A model of rational
menu search." Proceedings of the 33rd Annual ACM Conference on Human Factors in
Computing Systems. ACM, 2015.
- Brochu, Eric, Vlad M. Cora, and Nando De Freitas. "A tutorial on Bayesian
optimization of expensive cost functions, with application to active user modeling
and hierarchical reinforcement learning." arXiv preprint arXiv:1012.2599 (2010).
- Card, Stuart K., Thomas P. Moran, and Allen Newell. "The keystroke-level model
for user performance time with interactive systems." Communications of the ACM
23.7 (1980): 396-410.