
Computational Rationality I - a Lecture at Aalto University by Antti Oulasvirta

This 2-hour lecture looks at the emerging field of Computational Rationality. Lecture given March 12, 2018, for the Aalto University Master's level course "Probabilistic Programming and Reinforcement Learning for Cognition and Interaction." Based on Gershman et al. (2015, Science), Lewis et al. (2014, Topics in Cognitive Science), and Gershman & Daw (2017, Annual Review of Psychology).



  1. 1. Computational Rationality I Aalto University course CS-E4070 Antti Oulasvirta, Associate Professor userinterfaces.aalto.fi March 12, 2018
  2. 2. About the speaker A cognitive scientist leading the User Interfaces group at Aalto University (userinterfaces.aalto.fi). Modeling the joint performance of human-computer interaction in order to improve user interfaces, and developing new principles of design and intelligent support
  3. 3. Recent book
  4. 4. You’re in a traffic jam. Do you continue, or exit and find another route?
  5. 5. What determines a hiker’s route?
  6. 6. Complexities of real-world tasks To achieve human-level flexibility and adaptivity, we must solve: 1. Generalization: Going from previous episodes to an unseen one 2. Latent learning: Adapting to distal changes in environment 3. Planning: Sequencing actions while considering long-term effects on reward 4. Compositionality: Good solutions require putting together partial solutions in a clever way 5. Exploration/exploitation: Knowing when to learn the structure of a task or environment vs. when to exploit it 6. Uncertainty: Knowledge can be incomplete or incorrect 7. Resource limitations: Limited time and capabilities 8. Curse of dimensionality: A very large number of possibilities
  7. 7. Computational rationality is the study of computational principles of intelligence in living and artificial beings.
  8. 8. In particular, it looks at intelligence as rational behavior...
  9. 9. Overview Computational rationality brings together ideas from AI, robotics, cognitive science, and the neurosciences. It refers to computational principles for 1. "identifying decisions with highest expected utility, while taking into consideration the costs of computation in complex real-world problems in which most relevant calculations can only be approximated." (Gershman et al. 2015 Science) 2. implementing bounded optimality in humans (Lewis et al. 2014 Topics in Cog Sci). The two definitions are discussed in this lecture
  10. 10. Computational rationality is HARD The involved problems are computationally hard (indeed, part of the point is to explain how the mind copes with them). Theories must not only produce intelligent-looking behavior (as in AI), but be • cognitively and neurally plausible • supported by empirical data Computational Rationality I – Antti Oulasvirta, March 12, 2018
  11. 11. Why computational rationality? Powerful computational principles that both explain human-like adaptivity and generate it • Key capability: Understand adaptive behavior considering the joint influences of environment, objectives, and capabilities Applications: 1. Machine learning and AI: Avoid overfitting; increase interpretability; new principles for adaptivity and learning 2. Cognitive science: Avoid mistaking an adaptive capacity for a fixed mechanism 3. Neurosciences: Link between neural, cognitive, and behavioral explanations of the human mind 4. Human-computer interaction: Adapting and designing while taking adaptive human capabilities into account
  12. 12. Personal note on how revolutionary this is for HCI
  13. 13. This lecture looks at computational rationality from a cognitive and neuroscientific viewpoint Lecture outline Human mind is computational Human mind is rational Human mind is computationally adaptive Bounded agents Reinforcement learning Human mind is adaptive Background Basic ideas Some discovered principles Revisit: From a neuroscience point-of-view A generalized view for cognitive sciences
  14. 14. This lecture is based on three papers. We assume familiarity with model-free and model-based RL from Prof. Kyrki's talk
  15. 15. Scope of this lecture This lecture provides an overview of the intellectual history, core problems and concepts, and recent achievements. Examples are given but details are deferred. Next week's lecture "Computational Rationality II" zooms into selected topics: • Theory of Mind • POMDPs • Emotions (if there's time)
  16. 16. Human mind is computational Human mind is computational Human mind is rational Human mind is computationally adaptive Bounded agents Reinforcement learning Human mind is adaptive Lecture outline
  17. 17. Common assumptions of the information processing view of mind I. Cognitive processes consist of the transmission of information through a series of stages (serial), in which information is transformed in order to achieve a goal II. Higher mental processes are understood as the collective action of elementary processes; processes occur independently and can be isolated III. Human cognition has a limited capacity for storing and transmitting information
  18. 18. Two directions of research 1. Full-fledged cognitive architectures that describe the mind's information processing flow and bounds 2. Cognitive algorithms/tricks that simplify complex problems
  19. 19. 1. Information processing architectures Emerged as a computational framework by which researchers can build models for particular tasks and run them in simulation to generate cognition and action. Akin to a programming language where the constraints of the human system are also embedded into the architecture. A number of architectures have appeared over the past decades, such as ACT-R (Anderson, 2007), Soar (Laird, Newell, & Rosenbloom, 1987; Laird, 2012), EPIC (Meyer & Kieras, 1997), and others. Brumby et al. 2018 in Computational Interaction, OUP
  20. 20. David Marr: Cascade of computations that enable perceptual organization from retinal features (primal sketch)
  21. 21. Cognitive architectures ("boxologies") Example: Wickens & Hollands 1999
  22. 22. A limitation addressed by CR Adaptive behavior does not emerge but is mostly prescribed by researchers (exceptions exist, e.g., in ACT-R). Example of a researcher-given task recipe: Brumby et al. 2018 in Computational Interaction, OUP
  23. 23. Video: Distract-r Dario Salvucci
  24. 24. 2. Simplifying computational principles of the human mind "Via evolution the brain has achieved a remarkable ability to solve complex problems quickly and energy-efficiently by means of simplified processing principles, imposing its own rules on it, and using its past experiences."
  25. 25. Example: Time-to-contact estimation Rushton & Wann 1999 Nature
  26. 26. Example: Time-to-contact estimation Time-to-collision (TTC) can be estimated with a simple formula from retinal input (Rushton & Wann 1999 Nature). There is psychophysical evidence for the early combination of size and disparity motion signals (dθ/dt + dα/dt), and neurophysiological evidence for the combination of optic size and disparity (θ + α) at an early stage of visual processing. A TTC estimate can be based on a ratio of these combined inputs: TTC_dipole = (θ + α) / (dθ/dt + dα/dt). (The label "dipole" is adopted from the theory of texture perception: a single point viewed by two eyes specifies a binocular dipole, and two points, such as two opposite edges of an object, viewed by one eye specify a monocular dipole; the model sums dipoles and does not distinguish their origin.) An alternative means of estimating the ratio is to take the rate of change of the summed dipole length in a logarithmic coordinate system, TTC_dipole = 1 / (d[ln(θ + α)]/dt). [Figure: temporal error with looming; the TTC estimate plateaus at 750 ms]
  27. 27. Example: Motor control with muscle synergies Instead of coordinating muscles separately, we learn to control muscle groups. This collapses the problem to a lower-dimensional one. Cheung et al. 2012 PNAS; Ting & McKay 2007 Curr Opin Neurobiol
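A sketch of this dimensionality collapse (the synergy matrix below is random, purely for illustration): a fixed matrix W maps a low-dimensional synergy activation to the full set of muscle activations, so the controller only chooses two numbers instead of eight.

```python
import numpy as np

# Hypothetical illustration: 8 muscles driven through 2 synergies.
# W holds fixed, nonnegative synergy patterns; a command is m = W @ s.
rng = np.random.default_rng(0)
n_muscles, n_synergies = 8, 2
W = np.abs(rng.normal(size=(n_muscles, n_synergies)))

def muscle_command(s):
    """Expand a 2-D synergy activation s into an 8-D muscle activation."""
    return W @ np.asarray(s)

m = muscle_command([0.7, 0.2])
print(m.shape)  # (8,): eight muscle activations from two control variables
```

The control problem is now posed in the 2-D synergy space, which is the sense in which synergies simplify motor control.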
  28. 28. Challenge #1 for computational rationality Information processing views do not describe the adaptive properties of mind. The agent either fails or succeeds in achieving a goal but does not adapt or reorganize itself accordingly without explicit instruction to do so
  29. 29. Human mind is rational Human mind is computational Human mind is rational Human mind is computationally adaptive Bounded agents Reinforcement learning Human mind is adaptive Lecture outline
  30. 30. Rational analysis and the utility maximization view of the human mind The mind is adapted to its environment. Thus, to understand cognition, we need to study the utility/reward structure of tasks and the environment: 1. Goals: Specify precisely the goals of the cognitive system 2. Environment: Develop a formal model of the environment to which the system is adapted 3. Optimization: Derive the optimal behavioral function given 1-2 above. A long history in economics and psychology
  31. 31. Satisficing and bounded rationality People were found to be "suboptimal" in many tasks; hence Herbert Simon's notion of satisficing
  32. 32. History of bounded rationality
  33. 33. Are people "intuitive statisticians"? People were found not to follow Bayesian decision theory in verbally given statistical reasoning tasks, exhibiting neglect and fallacies. This led to the proposal of informal heuristics and biases as decision-making principles
  34. 34. Challenge #2 for computational rationality If the brain is adapted to compute rationally with bounded resources, reasoning fallacies follow naturally from optimization. "Optimal behavior" does not mean our lay notion of optimality: behavior is optimal in light of organismic objectives and the external environment
  35. 35. Human mind is adaptive Human mind is computational Human mind is rational Human mind is computationally adaptive Bounded agents Reinforcement learning Human mind is adaptive Lecture outline
  36. 36. Humans show tremendous capability to adapt and optimize behavior Perception Attention Procedural memory (e.g., bicycling) Episodic memory (memory for events) Declarative memory (memory for facts)
  37. 37. Find the Weather icon: +
  38. 38. Bayesian brain hypothesis The brain operates in "situations of uncertainty in a fashion that is close to the optimal prescribed by Bayesian statistics" Demonstrated e.g. in • Psychophysics • Perception • Attention • Motor control. Example: The brain is claimed to use Bayes' rule to derive optimal timing decisions based on compromised visual information
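As a minimal illustration of such use of Bayes' rule (all numbers below are hypothetical): with a Gaussian prior over time-to-contact and a noisy Gaussian visual estimate, the Bayes-optimal posterior estimate is a precision-weighted average of the two.

```python
# Hypothetical numbers: Gaussian prior over time-to-contact (seconds)
# combined with a noisy visual estimate via Bayes' rule. The posterior
# mean is a precision-weighted average of prior mean and observation.
mu_prior, sigma_prior = 0.75, 0.20   # prior belief about TTC
x_obs, sigma_obs = 0.60, 0.10        # compromised visual estimate

w = sigma_obs**-2 / (sigma_obs**-2 + sigma_prior**-2)   # weight on the data
mu_post = w * x_obs + (1 - w) * mu_prior
sigma_post = (sigma_obs**-2 + sigma_prior**-2) ** -0.5

print(round(mu_post, 3))   # 0.63: pulled toward the more reliable observation
```

The posterior is also narrower than either source alone, which is the signature of optimal cue combination found in psychophysics.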
  39. 39. Example: Visual statistical learning Gaze distribution on a novel page is driven by expectations of locations based on previous pages
  40. 40. Example: Ecological accounts of the adaptive nature of long-term memory Schooler and Anderson (1989): memory is adapted to need probability
  41. 41. Challenge #3 for computational rationality In many sensorimotor-cognitive tasks, the brain shows Bayesian-like abilities, being able to predict under uncertainty and "repair" missing data. The brain adapts to experienced contingencies in the world
  42. 42. Computational rationality Gershman, Horvitz, & Tenenbaum (2015) Science Human mind is computational Human mind is rational Human mind is computationally adaptive Bounded agents Reinforcement learning Human mind is adaptive Lecture outline
  43. 43. Where are we? Thus far: Research predating computational rationality has shown computational principles for: • Rational decision-making and reasoning • Adaptive cognitive and sensorimotor abilities Computational rationality brings these together to simulate how intelligent agents can reconfigure their behavior flexibly in complex real-world problems
  44. 44. Definition of computational rationality "Computing with representations, algorithms, and architectures designed to approximate decisions with the highest expected utility, while taking into account the costs of computation." Gershman, Horvitz, & Tenenbaum (2015) Science. Models build on "inferential processes for perceiving, predicting, and reasoning under uncertainty"
  45. 45. Three central themes 1. Maximization of expected utility (MEU) as a general-purpose ideal for decision-making under uncertainty 2. Approximating MEU is necessary, because estimation of MEU is non-trivial for most real-world problems 3. The choice of how to approximate it is itself a decision subject to the utility calculus. Breakthroughs started to emerge after probabilistic graphical models...
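Theme 2 can be made concrete with a small sketch (the actions, outcome probabilities, and utilities below are invented for illustration): when exact expected utilities are unavailable, they can be approximated by sampling outcomes.

```python
import random

# Invented decision problem: two actions with different win probabilities.
# Exact MEU is available here, but we approximate it by Monte Carlo
# sampling, as one must in problems too complex for exact calculation.
random.seed(0)

def utility(outcome):
    return {"win": 10.0, "lose": -2.0}[outcome]

def sample_outcome(action):
    p_win = {"safe": 0.5, "risky": 0.3}[action]
    return "win" if random.random() < p_win else "lose"

def approx_expected_utility(action, n_samples=10_000):
    return sum(utility(sample_outcome(action)) for _ in range(n_samples)) / n_samples

best = max(["safe", "risky"], key=approx_expected_utility)
print(best)  # "safe" (exact expected utilities: 4.0 vs 1.6)
```

Theme 3 enters when the sample budget itself is chosen: fewer samples mean a cheaper but noisier decision.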
  46. 46. 1. When to stop computing? Estimating the time-critical losses of continuing computation
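A caricature of this meta-level decision, with invented numbers: deliberation continues only while the expected gain from one more computation step exceeds the cost of the time that step takes.

```python
# Invented numbers: a meta-level stopping rule. Keep computing only
# while the expected improvement from one more step exceeds the cost
# of the time it takes (the value of computation).
def deliberate(expected_gains, cost_per_step):
    """Return the number of computation steps a rational agent runs."""
    steps = 0
    for gain in expected_gains:
        if gain <= cost_per_step:   # further computation not worth its cost
            break
        steps += 1
    return steps

# Each additional step improves the decision by half as much as the last.
gains = [8.0, 4.0, 2.0, 1.0, 0.5]
print(deliberate(gains, cost_per_step=1.5))  # 3
```

Raising the time pressure (cost per step) makes the same agent stop sooner, which is the qualitative pattern computational rationality predicts for time-critical decisions.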
  47. 47. 2. Resource-constrained sampling
  48. 48. 3. Trade-off among cognitive systems
  49. 49. Problem: One-shot concept learning Lake et al. (2015) Science ”The model represents concepts as simple programs that best explain observed examples under a Bayesian criterion. On a challenging one-shot classification task, the model achieves human-level performance while outperforming recent deep learning approaches. We also present several “visual Turing tests” probing the model’s creative generalization abilities, which in many cases are indistinguishable from human behavior.”
  50. 50. 4. Probabilistic program induction Lake et al. (2015) Science
  51. 51. Summary Recent breakthroughs have found new ways to approximate MEU, e.g. in reinforcement learning and by using probabilistic graphical models. But these do not sum up to a unified view of CR. What we have is a loose goal and a set of principles...
  52. 52. Bounded agents Lewis, Howes, Singh 2014 Topics in Cog Sci Human mind is computational Human mind is rational Human mind is computationally adaptive Bounded agents Reinforcement learning Human mind is adaptive Lecture outline
  53. 53. Emergence of adaptive behavior “Interaction emerges in a system consisting of rewards and costs (or utilities), actions, and constraints (e.g., structure of the environment). Adaptation is exhibited in different strategies for using a computer.” Howes et al. 2009; Payne & Howes 2013 [Diagram: capacities, utilities, and ecology jointly define the space of possible behaviors; within it, a space of reasonable behaviors; and within that, optimal behavior]
  54. 54. Overview • Assume that users behave (approximately) to maximize utility given limits on their own capacity • Optimality bounded by (1) the environment; (2) utility; and (3) the user’s capabilities • People are “bounded agents” • Optimal behavioral strategies can be estimated using e.g. reinforcement learning • No need for hard-wiring task procedures (cf. “old cognitive models”)
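The last point, estimating optimal strategies with reinforcement learning, can be sketched with minimal tabular Q-learning on a toy chain task (hypothetical, not one of the models cited here): the behavioral strategy emerges from rewards and constraints rather than from a hand-coded task procedure.

```python
import random

# Toy chain task: states 0..4, actions -1/+1, reward 1.0 on reaching
# state 4. The rightward strategy is never hand-coded; it emerges
# from tabular Q-learning over episodes of interaction.
random.seed(1)
n_states, actions = 5, (-1, 1)
Q = {(s, a): 0.0 for s in range(n_states) for a in actions}
alpha, gamma, eps = 0.5, 0.9, 0.1

for _ in range(2000):
    s = 0
    while s != n_states - 1:
        if random.random() < eps:                  # explore
            a = random.choice(actions)
        else:                                      # exploit, random tie-break
            a = max(actions, key=lambda b: (Q[(s, b)], random.random()))
        s2 = min(max(s + a, 0), n_states - 1)
        r = 1.0 if s2 == n_states - 1 else 0.0
        target = r + gamma * max(Q[(s2, b)] for b in actions)
        Q[(s, a)] += alpha * (target - Q[(s, a)])
        s = s2

policy = {s: max(actions, key=lambda b: Q[(s, b)]) for s in range(n_states - 1)}
print(policy)  # every non-terminal state prefers +1 (move right)
```

The same machinery scales (with function approximation) to the HCI agents discussed later in the lecture.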
  55. 55. Key assumptions Bounded optimality: Cognitive mechanisms adapt not only to the structure of the environment but to the human mind/brain itself. Theories of computational rationality are optimal-program problems
  56. 56. Definitions A bounded agent is a machine M with • OM, a space of possible observations • AM, a space of possible actions • PM, a space of programs that can run on the machine Choosing a program p specifies an agent model <M,p>. Behavior is a history of alternating observations and actions, <o1, a1, o2, a2, ...>. An agent is bounded when its behaviors exhibit a subset of all possible behaviors
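A toy rendering of these definitions (the environment and program below are hypothetical): a program maps observations to actions, and running the agent against an environment produces a history of alternating observations and actions.

```python
# Hypothetical rendering of the definitions: a program p maps
# observations to actions; running agent <M, p> in an environment
# produces a history of alternating observations and actions.
def run(program, environment, n_steps):
    history, state = [], environment["init"]
    for _ in range(n_steps):
        obs = environment["observe"](state)
        act = program(obs)
        history.extend([obs, act])
        state = environment["step"](state, act)
    return history

# Toy environment: observe a counter; the program reports its parity.
env = {"init": 0, "observe": lambda s: s, "step": lambda s, a: s + 1}
parity_program = lambda obs: obs % 2

print(run(parity_program, env, 3))  # [0, 0, 1, 1, 2, 0]
```

Boundedness shows up here as the restriction of behaviors to those some program in PM can generate; a different program over the same machine yields a different bounded agent.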
  57. 57. Bounded optimal programs Machine M can be any cognitive or neural model. A utility function U gauges goals, tasks, and subjective utility. Bounded optimality (Russell & Subramanian 1995): the set of optimal programs for a machine maximizes expected utility, with an expectation over a distribution of environments and, within each environment, an expectation over histories; roughly, p* ∈ argmax over p ∈ PM of E over environments e of E over histories h of U(h)
  58. 58. Remarks The cost of finding the optimal program and the cost of executing it are different. The optimality of a program is not the same as the optimality of behavior. Multiple levels of optimality explanations can be identified: • Ecological optimality • Bounded optimality • Ecological-bounded optimality
  59. 59. Remark about “human-level” performance claims in AI The Atari game-playing DL agent is not solving the same problem as humans are when they play games. Different observations & actions -> different bounded programs
  60. 60. Examples of bounded agents
  61. 61. Three bounded agents in HCI tasks A visual search agent (Jokinen et al. Proc. CHI 2017) • Solves a sampling problem: where to gaze when searching a UI • The optimal bounded program is a strategy for recruiting its own capabilities to optimally sample the display A text entry agent (Sarcar et al. IEEE Pervasive Computing 2018) • Solves a sampling and control problem: where to gaze and where to move the fingers when entering text A button-pressing agent (Oulasvirta et al. Proc. CHI 2018) • Solves a control problem: how to control muscles to press a button in order to improve its own precision in activating it in time • The optimal bounded program is an intrinsic probabilistic model that tells which muscle signal to send for desired effects (button activation, temporal precision, muscular effort)
  62. 62. 1. Visual sampling Predicts visual search behavior after a layout has changed (Jokinen et al. Proc. CHI 2017). The visual search model predicts visual search times for new and changed layouts. For a novice user without any prior exposure to the layout, the model predicts that, of the three elements chosen for this comparison, the salient green element is the fastest to find. After learning the locations, the expert model finds all elements fairly quickly. At this point, one blue element and the green element change place. Search times for the moved elements are longer than for the green element, because the model remembers the distinctive features of the latter. Figure 2: On the basis of expected utility, the controller requests attention deployment to a new visual element from the eye-movement system; this directs attention to the most salient unattended visible object.
  63. 63. 1. Visual sampling Utility learning (Jokinen et al. Proc. CHI 2017). Encoding an object allows the model to decide whether it is the target or a distractor. Before the model can encode any objects, it needs to attend one. The feature-guidance component holds a visual representation of the environment and, at the controller's request, resolves the request to deploy attention to one of the objects in it. The attended target is determined by the properties of the visual objects, whose presence in the visual representation is based on their eccentricity. A feature is visually represented if its angular size is larger than a·e² − b·e, (1) where e is the eccentricity of the object (in the same units as the size) and a and b are free parameters that depend on the visual feature in question. Their values, from the literature, are a = 0.104 and b = 0.85 for colour, 0.14 and 0.96 for shape, and 0.142 and 0.96 for size [35]. On the basis of the represented visual features, each object is given a total activation as a weighted sum of bottom-up and top-down activations. Bottom-up activation is the saliency of an object, calculated as the dissimilarity of its features v to all other objects of the environment, weighted by the square root of the linear distance d between the objects.
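The eccentricity rule in Equation 1 can be written out directly (parameter values as quoted; the function name is mine):

```python
# Parameter values as quoted on the slide; the function name is mine.
# A feature is represented if its angular size exceeds a*e^2 - b*e,
# where e is eccentricity (in the same units as the size).
PARAMS = {"colour": (0.104, 0.85), "shape": (0.14, 0.96), "size": (0.142, 0.96)}

def feature_visible(feature, angular_size, eccentricity):
    a, b = PARAMS[feature]
    return angular_size > a * eccentricity**2 - b * eccentricity

# A 1-unit coloured feature is available at 5 units of eccentricity,
# but falls below threshold in the far periphery.
print(feature_visible("colour", 1.0, 5.0), feature_visible("colour", 1.0, 15.0))
```

Near the fovea the threshold is negative, so any feature is represented; the quadratic term dominates in the periphery, which is what makes peripheral guidance rely on salient features.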
  64. 64. Results: example Effects of layout change on visual sampling strategy and therefore search costs
  65. 65. 2. Ability-based optimization of text entry Design a text entry method that allows a user to reach maximum performance / minimize errors given his/her abilities. Users aged 65+: 7 wpm with touchscreen devices
  66. 66. Touch-WLM Visuomotor strategies. Modeling sensorimotor performance in text entry
  67. 67. Model parameters represent idiosyncratic and strategic differences
  68. 68. Design space
  69. 69. Optimized designs Baseline, Tremor, Dyslexia. Significant improvements to typing speed
  70. 70. 3. How does the brain achieve control ...of a button? Oulasvirta et al. Proc. CHI 2018
  71. 71. What happens during a button press?
  72. 72. The problem posed to the brain Pressing a button requires careful timing and proportioning of force. The brain should be able to predict how to press a new button and, if it fails, how to repair. A DOF problem + a prediction problem. But buttons are black boxes! [Excerpt from Oulasvirta et al. Proc. CHI 2018: a button is an electromechanical device that makes or breaks a signal when pushed, then returns to its initial (or re-pushable) state when released; it converts a continuous mechanical motion into a discrete electric signal. Tactile push-buttons offer points of interest (POIs) during press-down and release: actuation force is considered the most important design parameter, and the snap ratio determines the intensity of the tactile "bump" (a snap ratio greater than 40% is recommended for a strong tactile feeling). Touch buttons are zero-travel buttons, activated by thresholding the contact area of the finger pulp on the surface; because of false activations, the finger cannot rest on the surface. Mid-air buttons are based on, e.g., computer vision or electromyographic sensing; being contactless, they have no force curve, and activation is determined by joint angle or distance traveled by the fingertip. Figure 2 of the paper shows idealized force–displacement curves for linear and tactile buttons, with green lines for press and blue lines for release curves.]
  73. 73. Neuromechanics: Predictive control of a black box (“the black box”)
  74. 74. Neuromechanics modeling Intrinsic probabilistic model attempts to take over control of its own sensations when pressing a button. Figure 4: NEUROMECHANIC is a computational model of neuromechanics in button-pressing. It implements a probabilistic internal model (Gaussian process regression) that attempts to minimize error between its expected and perceived button activation. Its motor commands are transferred via a noisy and delayed neural channel to muscles controlling the finger. A physical simulation of the finger acting on the button yields four types of sensory signals that are integrated into a single percept (p-center) by means of a maximum likelihood estimator. Oulasvirta et al. Proc. CHI 2018
  75. 75. Elements of the approach Probabilistic internal model (Bayesian optimization using GP) Perceptual control (Predicting the felt consequences of movement) Neural transmission and muscle activation (Noisy signals) Movement dynamics (Mechanics modeling) Multiple noisy sensory signals (Noisy signals) Probabilistic cue integration (Maximum likelihood estimator)
  76. 76. Let’s look inside the box
  77. 77. Perceptual control of button activation Figure 3: Perceptual control of a button: the motor system has no access to the true moment of activation, but it can try to reduce the error between the moment it expected and the moment it perceives. Left: perceptual control fails. Right: precise control. Oulasvirta et al. Proc. CHI 2018
  78. 78. Neuromechanics modeling (Oulasvirta et al. Proc. CHI 2018) NEUROMECHANIC implements these ideas computationally. It consists of two connected sub-models (Figure 4). Objective function: A motor command q sent to the finger muscles consists of three parameters, q = {μA+, τA+, σA+} (1): signal offset μ, signal amplitude τ, and duration σ of the agonist (A+) muscle, with physiologically plausible minima and maxima set for the activation parameters. The objective is to determine the q that minimizes error: min over q of EP(q) + EA(q) + EC(q) (2), where EP is error in predicting perception, EA is error in activating the button, and EC is error in making contact (button touched). Integration of p-centers: The model is connected to four sensory signals (mechanoreception, proprioception, audition, vision); each produces a p-center pci. The maximum-likelihood estimate of the integrated p-center pco is a weighted average: pco = Σi wi·pci, where wi = (1/σi²) / (Σi 1/σi²) (7), with wi being the weight given to the i-th single-cue estimate and σi² being that estimate's variance. Absolute differences among the pci do not affect pco; only the signal variances do. The integrated timing estimate is robust to long delays in, say, auditory or visual feedback. This assumption is based on a study showing that physiological events that take place quickly, within a few hundred milliseconds, do not tend to cause over- or underestimation of event durations [14]. Implementation: NEUROMECHANIC is implemented in MATLAB, using BAYESOPT for Bayesian optimization (the GP model uses the ARD Matérn 5/2 kernel) and SIMSCAPE for mechanics.
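The inverse-variance weighting of Equation 7 is easy to sketch (the p-center values and variances below are illustrative only, not from the paper):

```python
# Illustrative values only: four signal-specific p-centers (ms) and
# their variances, combined by inverse-variance weighting as in Eq. 7.
def integrate_pcenters(estimates, variances):
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    return sum(w * p for w, p in zip(weights, estimates)) / total

# Mechanoreceptive, proprioceptive, auditory, visual p-centers:
pco = integrate_pcenters([100.0, 110.0, 130.0, 160.0],
                         [4.0, 16.0, 64.0, 256.0])
print(pco)  # 104.0: dominated by the lowest-variance (tactile) cue
```

The integrated estimate sits close to the most reliable cue, which is why long delays in a high-variance channel such as vision barely shift pco.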
  79. 79. Parameters (Table 1 of the paper; button parameters are given for physical buttons; task parameters, e.g., finger starting height, are given in the text; f denotes a function):
  fr, radius of finger cone: 7.0 mm
  fw, length of finger: 60 mm
  ρf, density of finger: 985 kg/m³
  cf, damping of finger pulp: 1.5 N·s/m [64]
  kf, stiffness of finger pulp: f, N/m [65]
  wb, width of key cap: 14 mm
  db, depth of key cap: 10 mm
  ρb, density of key cap: 700 kg/m³
  cb, damping of button: 0.1 N·s/m
  ks, elasticity of muscle: 0.8·PCSA [38]
  kd, elasticity of muscle: 0.1·ks [38]
  kc, damping of muscle: 6 N·s/m [38]
  PCSA, physiological cross-sectional area: 4 cm²
  L0ag, L0an, initial muscle length: 300 mm
  σn, neuromuscular noise: 5·10⁻²
  σm, mechanoreception noise: 1·10⁻⁸
  σp, proprioception noise: 8·10⁻⁷
  σa, sound and audition noise: 5·10⁻⁴
  σv, display and vision noise: 2·10⁻²
  Except for the neural noise parameters, all parameters are physically measurable or known. Button-pressing behavior emerges. [Figure 7: data collection on press kinematics in a single-subject study; high-fidelity optical motion tracking tracked a marker on the finger nail, with a custom-made single-button setup created from switches and key caps of commercial keyboards. Simulations compared four button types: tactile, linear, touch, and mid-air.]
  80. 80. Example result: Force–velocity curves. Figure 2. Idealized force–displacement curves for linear (left) and tactile (right) buttons. Green lines are press and blue lines are release curves. Annotations (A–H) are covered in the text. TOUCH BUTTONS: activation is triggered by thresholding the contact area of the pulp of the finger on the surface; because of false activations, the finger cannot rest on the surface. MID-AIR BUTTONS are based not on electromechanical sensing but, for example, on computer vision or electromyographic sensing. Since they are contactless, they do not have a force curve; the point of activation is determined by reference to the angle at a joint or the distance traveled by the fingertip. Latency and inaccuracies in tracking are known issues with mid-air buttons. Oulasvirta et al. Proc. CHI 2018
  81. 81. Emulating a light touch. Figure 11. Predicted muscle force–displacement behavior for a tactile-type button: without and with an effort-minimizing term in the objective function. To prevent NEUROMECHANIC from pushing the button with unrealistically high force, which would in reality cause fatigue and stress, we introduce a controllable ergonomics (or effort) term into the objective. Adding tuning factors, the objective becomes: min_q w_EP E_P(q) + w_EA E_A(q) + w_EC E_C(q) + w_FM F_M(q) (4), where F_M is the muscle force expenditure from the Hill muscle model (see below) and the w_i are tuning factors. By changing the weights, the model can simulate, for example, a user trading off effort versus temporal precision, or a user not caring about temporal precision but only about activating the button. Oulasvirta et al. Proc. CHI 2018
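The weighted objective can be exercised with a minimal sketch. Everything below is invented for illustration: the error terms are toy quadratics standing in for the simulation-derived errors E_P, E_A, E_C, and the paper's Bayesian optimizer (BAYESOPT in MATLAB) is replaced by plain random search within the parameter bounds.

```python
import random

# Toy stand-ins for the error terms of the weighted objective; in
# NEUROMECHANIC these come from the neuromechanical simulation,
# not from closed-form expressions.
def objective(q, w=(1.0, 1.0, 1.0, 0.1)):
    mu, t, s = q                     # motor command: offset, amplitude, duration
    e_p = (mu - 0.12) ** 2           # toy perceptual-error term E_P
    e_a = (t - 0.5) ** 2             # toy activation-error term E_A
    e_c = (s - 0.08) ** 2            # toy contact-error term E_C
    f_m = t * s                      # toy muscle-force (effort) term F_M
    w_ep, w_ea, w_ec, w_fm = w
    return w_ep * e_p + w_ea * e_a + w_ec * e_c + w_fm * f_m

def minimize(n_iter=5000, seed=0):
    """Random search within plausible bounds, standing in for BAYESOPT."""
    rng = random.Random(seed)
    best_q, best_val = None, float("inf")
    for _ in range(n_iter):
        q = (rng.uniform(0.0, 0.3),  # offset mu, in seconds
             rng.uniform(0.0, 1.0),  # amplitude t, normalized
             rng.uniform(0.0, 0.2))  # duration s, in seconds
        val = objective(q)
        if val < best_val:
            best_q, best_val = q, val
    return best_q, best_val
```

Raising w_FM pulls the optimum toward a smaller amplitude and duration, which is how an effort-sensitive, "light touch" press can be emulated.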
  82. 82. Comparison among button types. Peak muscle forces are 1.7–2.0 N for humans; the model predicts 1.4–1.6 N. Mid-air buttons are the worst, as confirmed by the model. In NEUROMECHANIC, the trade-off between force use and temporal precision in the objective function is controlled by the tuning factor w_FM; when w_FM is set to zero, the peak muscle force for a tactile button increases to 2.45 N.
Table 2. Simulation results for the four button types.
Measure | Linear | Tactile | Touch | Mid-air
Perceptual error | 47 ms | 40 ms | 34 ms | 178 ms
Std of perceptual error | 31 ms | 26 ms | 76 ms | 47 ms
Std of activation time | 52 ms | 43 ms | 90 ms | 51 ms
Activation success | 92% | 82% | 94% | 54%
Peak muscle force | 1.65 N | 1.41 N | 2.6 N | 2.9 N
  83. 83. Why are mid-air buttons so unusable? Oulasvirta et al. Proc. CHI 2018
  84. 84. Downstream effects of design and system properties. Figure 4. NEUROMECHANIC is a computational model of neuromechanics in button-pressing. It implements a probabilistic internal model (Gaussian process regression) that attempts to minimize the error between its expected and perceived button activation. Its motor commands are transferred via a noisy and delayed neural channel to the muscles controlling the finger. A physical simulation of the finger acting on the button yields four types of sensory signals that are integrated into a single percept (p-center) by means of a maximum likelihood estimator. Mid-air buttons are worse because of the downstream effects of less reliable sensory feedback.
  85. 85. Discussion. Pros: modeling human behavior using computational rationality • Changes the modeling problem to the definition of observations, actions, bounds, and an optimality principle • An order of magnitude fewer parameters (cf. good old cognitive models) • Behavioral strategies “emerge” Challenges: • What is the right bounded problem (observations, actions)? • What are the right bounds? • What is the optimization mechanism?
  86. 86. Reinforcement learning Gershman & Daw (2017) Annual Review of Psychology Human mind is computational Human mind is rational Human mind is computationally adaptive Bounded agents Reinforcement learning Human mind is adaptive Lecture outline
  87. 87. This part: Revisiting RL from the perspective of neurosciences “Reinforcement learning (RL) is the process by which organisms learn by trial and error to predict and acquire reward.” Requirement: Brains must solve reinforcement learning style problems somehow, as evidenced by their impressive behavioural performance Hard: Curse of dimensionality is compounded by sequential dependency of actions and long-term effects on future reward.
  88. 88. Dayan & Niv 2008
  89. 89. Operant conditioning Skinner box
  90. 90. Model-free learning • Model-free learning (e.g., temporal-difference learning, TD) is easier to execute, as long-run values are already computed and only need to be compared • Adaptive, but less appropriate for changing environments: it fails in latent learning, with distal changes in rewards • Finding: a procedural learning system in the striatum • The firing rate of dopamine neurons in the ventral tegmental area (VTA) and substantia nigra (SNc) appears to mimic the error term of the algorithm (Schultz et al. 1997 Science) • Unconscious and cognitively impenetrable (Pessiglione et al. 2008 Neuron) • The ventral striatum corresponds to the “critic” and the dorsal to the “actor” (O’Doherty et al. 2004 Science)
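As a minimal sketch of the TD idea (illustrative only, not part of the lecture materials): tabular TD(0) on a three-state chain, where the TD error delta is the reward-prediction-error signal that dopamine firing is reported to resemble.

```python
def td_learning(episodes=2000, alpha=0.1, gamma=0.9):
    """Tabular TD(0) on a deterministic chain s0 -> s1 -> s2 (terminal).

    A reward of 1 arrives on entering s2. The TD error
    delta = r + gamma*V(s') - V(s) is the quantity that dopamine neuron
    firing is reported to resemble (Schultz et al. 1997): early in
    learning it spikes at the reward; once values are learned, it shifts
    to the earliest reward-predicting state.
    """
    V = [0.0, 0.0, 0.0]                           # V(s2) stays 0 (terminal)
    for _ in range(episodes):
        for s, s_next, r in [(0, 1, 0.0), (1, 2, 1.0)]:
            delta = r + gamma * V[s_next] - V[s]  # reward prediction error
            V[s] += alpha * delta
    return V
```

After learning, V is approximately [0.9, 1.0, 0.0]: the long-run values are cached and action selection only needs to compare them, but if the reward were silently moved, the cached values would stay stale until re-experienced, which is exactly the latent-learning failure noted above.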
  91. 91. Striatum
  92. 92. Schultz’ 1997 experiment Computational Rationality I – Antti Oulasvirta March 12, 2018 92
  93. 93. Tolman’s cognitive maps. Latent learning experiment.
  94. 94. Model-based learning. Model-based RL solves the latent learning problem by first learning the environment and then the rewards. Associated with the hippocampus, which is responsible for episodic and spatial memories. This discovery led to the rejection of model-free RL as the sole account of reinforcement learning.
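For contrast, a minimal model-based sketch (the corridor, actions, and rewards are invented for illustration): the agent plans by value iteration on a learned transition model, so a distal change in reward only requires replanning, not relearning the environment.

```python
def value_iteration(T, R, gamma=0.9, iters=100):
    """Plan on a learned model: T[s][a] -> next state, R[s] = reward on entry."""
    n = len(T)
    V = [0.0] * n
    for _ in range(iters):
        V = [max(R[T[s][a]] + gamma * V[T[s][a]] for a in T[s]) for s in range(n)]
    return V

def greedy_policy(T, R, V, gamma=0.9):
    """Pick, in each state, the action with the highest one-step lookahead value."""
    return [max(T[s], key=lambda a: R[T[s][a]] + gamma * V[T[s][a]])
            for s in range(len(T))]

# A learned model of a 4-state corridor: actions 'L'/'R' move along it.
T = [{'L': 0, 'R': 1}, {'L': 0, 'R': 2}, {'L': 1, 'R': 3}, {'L': 2, 'R': 3}]

# Reward initially at the right end...
R = [0.0, 0.0, 0.0, 1.0]
V = value_iteration(T, R)
pi = greedy_policy(T, R, V)      # heads right everywhere

# ...then a distal change moves the reward to the left end. A model-based
# agent replans immediately, with no new experience of the corridor needed.
R2 = [1.0, 0.0, 0.0, 0.0]
V2 = value_iteration(T, R2)
pi2 = greedy_policy(T, R2, V2)   # now heads left everywhere
```

A model-free learner caching state values would need to revisit the corridor to unlearn the old values; here the plan flips as soon as the reward entry in the model changes.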
  95. 95. Integrated models proposed. Lee et al. (2014) Neuron. Signatures of both types of learning have been found in neuroscientific studies.
  96. 96. Recognized shortcomings. Scaling up to real-world tasks: laboratory tasks are small and somewhat artificial • A handful of states and actions • Tasks designed to satisfy the Markov conditional independence property • Real-world situations offer extraneous sensory detail that is at once too vast and too impoverished to serve directly as states in RL • States look similar to each other
  97. 97. Tip: Status of RL in neuroscience. The good. The bad but tractable. The ugly: crucial challenges.
  98. 98. One line of work extends this to other types of human memory systems... Based on Larry Squire’s taxonomy (1987)
  99. 99. Example application of MDP
  100. 100. Model of menu search (Chen et al. CHI’15). Finds the optimal gaze pattern given a menu design and the parameters of the visual and cognitive system.
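The actual model is considerably richer, but the underlying idea, learning where to look by reinforcement, can be sketched with a toy one-state "menu" (all details below are invented, not taken from Chen et al.): Q-learning discovers which item to fixate when fixations cost time and finding the target pays off.

```python
import random

def q_learning_menu(target=2, n_items=3, episodes=3000,
                    alpha=0.1, gamma=0.9, eps=0.1, seed=0):
    """Toy gaze policy via Q-learning: actions = which item to fixate.

    Fixating a non-target item costs time (-0.1); fixating the target
    pays +1 and ends the episode. This is a hypothetical stand-in for
    the menu-search MDP, reduced to a single state for brevity.
    """
    rng = random.Random(seed)
    Q = [0.0] * n_items
    for _ in range(episodes):
        done = False
        while not done:
            # epsilon-greedy action selection
            if rng.random() < eps:
                a = rng.randrange(n_items)
            else:
                a = max(range(n_items), key=Q.__getitem__)
            r = 1.0 if a == target else -0.1
            done = a == target
            # single-state Q-update (bootstraps from max Q if not done)
            backup = r if done else r + gamma * max(Q)
            Q[a] += alpha * (backup - Q[a])
    return Q
```

After training, the greedy policy fixates the target item first; the learned Q-values for the other items reflect the time cost of a wasted fixation.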
  101. 101. Inverse Computational Rationality Kangasrääsiö et al. 2017 Proc. CHI
  102. 102. Why did the user click here?
  103. 103. “Algorithmic Sherlock Holmes”
  104. 104. Forward vs. inverse modeling From model to data (forward) -- from data to model (inverse) 104
  105. 105. Role of inverse modeling for CR. Theory formation • CR models need to be fitted to increasingly important and realistic datasets (behavioral, neural, cognitive). Application: “Why did the user click this?” A million-dollar question for Internet-based industries. CR models may disentangle the causes of observed behavior: 1. Teleological explanations (goals) 2. Capacity explanations (cognitive mechanisms) 3. Ecological explanations (structure of tasks and designs)
  106. 106. Alas: Inverse modeling with human data is hard. Multiple explanations for any observation • The same observation can be produced by different mechanisms. Stochasticity. Sparse data. Large individual and contextual variability. Kangasrääsiö et al. Proc. CHI 2017
  107. 107. ABC is a principled way to find optimal model parameters. Figure 1. This paper studies methodology for inference of parameter values of cognitive models from observational data in HCI. At the bottom of the figure, we have behavioral data (orange histograms). Kangasrääsiö et al. Proc. CHI 2017
  108. 108. How ABC works 1. Choose parameter values for the model 2. Simulate predictions 3. Evaluate the discrepancy between predictions and observations 4. Use a probabilistic model to estimate the discrepancy in different regions of the parameter space 5. (Repeat until converged) Kangasrääsiö et al. Proc. CHI 2017
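Steps 1–3 can be sketched with the simplest ABC variant, rejection ABC. The toy simulator and summary statistic below are invented for illustration, and step 4's probabilistic surrogate over the discrepancy is replaced here by plain rejection.

```python
import random
import statistics

def simulate(theta, n=200, rng=random):
    """Hypothetical simulator: stands in for running a cognitive model
    with parameter theta and collecting, e.g., task completion times."""
    return [rng.gauss(theta, 1.0) for _ in range(n)]

def abc_rejection(observed, prior=(0.0, 4.0), n_draws=5000, tol=0.1, seed=0):
    """Rejection ABC: keep parameter draws whose simulated data
    lie close (under a summary statistic) to the observations."""
    rng = random.Random(seed)
    obs_stat = statistics.mean(observed)
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(*prior)                   # 1. choose parameter values
        sim = simulate(theta, rng=rng)                # 2. simulate predictions
        disc = abs(statistics.mean(sim) - obs_stat)   # 3. evaluate discrepancy
        if disc < tol:                                # keep plausible values
            accepted.append(theta)
    return accepted    # samples from the approximate posterior

# "Observed" data generated with a true theta = 2.0, to be recovered
data_rng = random.Random(1)
observed = [data_rng.gauss(2.0, 1.0) for _ in range(200)]
posterior = abc_rejection(observed)
```

The accepted values form a sample from the approximate posterior: their mean should land near the true parameter, and their spread quantifies uncertainty, as on the posterior-estimation slide.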
  109. 109. How ABC works. Approximate Bayesian Computation (ABC). Kangasrääsiö et al. Proc. CHI 2017
  114. 114. How ABC works. Approximate Bayesian Computation (ABC). Indicates the most likely value and its uncertainty. Kangasrääsiö et al. Proc. CHI 2017
  115. 115. Uses of ABC. Optimal selection and calibration of a model for data: 1. Model selection (trying out different models) 2. Parameter inference (choosing the best parameters) 3. Posterior inference (understanding the space of plausible explanations) Kangasrääsiö et al. Proc. CHI 2017
  116. 116. Case: Menu interaction. Given click times only, predict the parameters of the HVS (human visual system). See: Kangasrääsiö et al. CHI 2017
  117. 117. Posterior estimation ABC yields a posterior distribution for the parameters 117
  118. 118. Explaining individual differences 118
  119. 119. Summary
  120. 120. Computational rationality is the study of computational principles of intelligence in living and artificial beings. It looks at intelligence as rational behavior
  121. 121. Main points. Rational + computational + adaptive = computational rationality: the study of the computational principles the mind uses to adapt. CR uniquely allows one both to generate and to infer adaptive behavior in complex tasks. Hard, because 1. the involved computational problems are high-dimensional 2. humans are complex and partially impenetrable 3. theories must be plausible neurally and cognitively
  122. 122. An exciting hotspot for attacking problems at the intersection of AI, ML, cognitive science, and robotics. Computational rationality directly touches on some of the hardest problems in psychology and philosophy of mind: • Connectionist vs. symbolic accounts of mind • The nature vs. nurture debate • Strong vs. weak AI and the possibility of general AI • The roles of consciousness and emotions. Enough exciting topics for several careers...
