
Computational Rationality I - a Lecture at Aalto University by Antti Oulasvirta

This 2-hour lecture looks at the emerging field of Computational Rationality. Lecture given March 12, 2018, for the Aalto University Master's level course "Probabilistic Programming and Reinforcement Learning for Cognition and Interaction." Based on Gershman et al. (2015, Science), Lewis et al. (2014, Topics in Cognitive Science), and Gershman & Daw (2017, Annual Review of Psychology).



  1. 1. Computational Rationality I Aalto University course CS-E4070 Antti Oulasvirta, Associate Professor userinterfaces.aalto.fi March 12, 2018
  2. 2. About the speaker A cognitive scientist leading the User Interfaces group at Aalto University (userinterfaces.aalto.fi). Modeling the joint performance of human-computer interaction in order to improve user interfaces, and developing new principles of design and intelligent support
  3. 3. Recent book
  4. 4. You’re in a traffic jam. Do you continue, or exit and find another route?
  5. 5. What determines a hiker’s route?
  6. 6. Complexities of real-world tasks To achieve human-level flexibility and adaptivity, we must solve: 1. Generalization: Going from previous episodes to an unseen one 2. Latent learning: Adapting to distal changes in environment 3. Planning: Sequencing actions while considering long-term effects on reward 4. Compositionality: Good solutions require putting together partial solutions in a clever way 5. Exploration/exploitation: Knowing when to learn the structure of a task or environment vs. when to exploit it 6. Uncertainty: Knowledge can be incomplete or incorrect 7. Resource limitations: Limited time and capabilities 8. Curse of dimensionality: A very large number of possibilities
  7. 7. Computational rationality is the study of computational principles of intelligence in living and artificial beings.
  8. 8. In particular, it looks at intelligence as rational behavior...
  9. 9. Overview Computational rationality brings together ideas from AI, robotics, cognitive science, and the neurosciences. It refers to computational principles for 1. "identifying decisions with highest expected utility, while taking into consideration the costs of computation in complex real-world problems in which most relevant calculations can only be approximated." (Gershman et al. 2015 Science) 2. implementing bounded optimality in humans (Lewis et al. 2014 Topics in Cog Sci). The two definitions are discussed in this lecture
  10. 10. Computational rationality is HARD The involved problems are computationally hard (indeed, part of the point is to explain how the mind copes with them). Theories must not only produce intelligent-looking behavior (as in AI), but be • cognitively and neurally plausible • supported by empirical data Computational Rationality I – Antti Oulasvirta, March 12, 2018
  11. 11. Why computational rationality? Powerful computational principles that both explain human-like adaptivity and generate it • Key capability: Understand adaptive behavior considering the joint influences of environment, objectives, and capabilities Applications: 1. Machine learning and AI: Avoid overfitting; increase interpretability; new principles for adaptivity and learning 2. Cognitive science: Avoid mistaking an adaptive capacity for a fixed mechanism 3. Neurosciences: Link between neural, cognitive, and behavioral explanations of the human mind 4. Human-computer interaction: Adapting and designing while taking adaptive human capabilities into account
  12. 12. Personal note on how revolutionary this is for HCI
  13. 13. This lecture looks at computational rationality from a cognitive and neuroscientific viewpoint Lecture outline Human mind is computational Human mind is rational Human mind is computationally adaptive Bounded agents Reinforcement learning Human mind is adaptive Background Basic ideas Some discovered principles Revisit: From a neuroscience point-of-view A generalized view for cognitive sciences
  14. 14. This lecture is based on three papers. We assume familiarity with model-free and model-based RL from Prof. Kyrki's talk
  15. 15. Scope of this lecture This lecture provides an overview of the intellectual history, core problems and concepts, and recent achievements. Examples are given but details are deferred. Next week's lecture "Computational Rationality II" zooms into selected topics: • Theory of Mind • POMDPs • Emotions (if there's time)
  16. 16. Human mind is computational Human mind is computational Human mind is rational Human mind is computationally adaptive Bounded agents Reinforcement learning Human mind is adaptive Lecture outline
  17. 17. Common assumptions of the information processing view of mind I. Cognitive processes consist of the transmission of information through a series of stages (serial), in which information is transformed in order to achieve a goal II. Higher mental processes are understood as the collective action of elementary processes; processes occur independently and can be isolated III. Human cognition has a limited capacity for storing and transmitting information
  18. 18. Two directions of research 1. Full-fledged cognitive architectures that describe the mind's information processing flow and bounds 2. Cognitive algorithms/tricks that simplify complex problems
  19. 19. 1. Information processing architectures Emerged as a computational framework by which researchers can build models for particular tasks and run them in simulation to generate cognition and action. Akin to a programming language where the constraints of the human system are also embedded into the architecture. A number of architectures have appeared over the past decades, such as ACT-R (Anderson, 2007), Soar (Laird, Newell, & Rosenbloom, 1987; Laird, 2012), EPIC (Meyer & Kieras, 1997), and others. Brumby et al. 2018 in Computational Interaction, OUP
  20. 20. David Marr: Cascade of computations that enable perceptual organization from retinal features (primal sketch)
  21. 21. Cognitive architectures ("boxologies") Example: Wickens & Hollands 1999
  22. 22. A limitation addressed by CR Adaptive behavior does not emerge but is mostly prescribed by researchers (exceptions exist, e.g., in ACT-R). Example of a researcher-given task recipe: Brumby et al. 2018 in Computational Interaction, OUP
  23. 23. Video: Distract-r Dario Salvucci
  24. 24. 2. Simplifying computational principles of the human mind "Via evolution the brain has achieved a remarkable ability to solve complex problems quickly and energy-efficiently by means of simplified processing principles, imposing its own rules on it, and using its past experiences."
  25. 25. Example: Time-to-contact estimation Rushton & Wann 1999 Nature
  26. 26. Example: Time-to-contact estimation Time-to-collision (TTC) can be estimated with a simple formula from retinal input (Rushton & Wann 1999 Nature). There is psychophysical evidence for the early combination of size and disparity motion signals (dθ/dt + dα/dt), and neurophysiological evidence for the combination of optic size and disparity (θ + α) at an early stage of visual processing. A TTC estimate can be based on a ratio of these combined inputs: TTC_dipole = (θ + α) / (dθ/dt + dα/dt). (The label "dipole" is adopted from the theory of texture perception: a single point viewed by two eyes specifies a binocular dipole, and two points, such as two opposite edges of an object, viewed by one eye specify a monocular dipole; the model sums dipoles and does not distinguish their origin.) An alternative means of estimating the ratio is to take the rate of change of the summed dipole length in a logarithmic coordinate system, TTC_dipole = 1 / (d[ln(θ + α)]/dt). [Figure: temporal error with looming; the TTC estimate plateaus at 750 ms]
  27. 27. Example: Motor control with muscle synergies Instead of coordinating muscles separately, we learn to control muscle groups. This collapses the problem to a lower-dimensional one. Cheung et al. 2012 PNAS; Ting & McKay 2007 Curr Opin Neurobiol
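A sketch of this dimensionality collapse (the synergy matrix below is random, purely for illustration): a fixed matrix W maps a low-dimensional synergy activation to the full set of muscle activations, so the controller only chooses two numbers instead of eight.

```python
import numpy as np

# Hypothetical illustration: 8 muscles driven through 2 synergies.
# W holds fixed, nonnegative synergy patterns; a command is m = W @ s.
rng = np.random.default_rng(0)
n_muscles, n_synergies = 8, 2
W = np.abs(rng.normal(size=(n_muscles, n_synergies)))

def muscle_command(s):
    """Expand a 2-D synergy activation s into an 8-D muscle activation."""
    return W @ np.asarray(s)

m = muscle_command([0.7, 0.2])
print(m.shape)  # (8,): eight muscle activations from two control variables
```

The control problem is now posed in the 2-D synergy space, which is the sense in which synergies simplify motor control.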
  28. 28. Challenge #1 for computational rationality Information processing views do not describe the adaptive properties of mind. The agent either fails or succeeds in achieving a goal but does not adapt or reorganize itself accordingly without explicit instruction to do so
  29. 29. Human mind is rational Human mind is computational Human mind is rational Human mind is computationally adaptive Bounded agents Reinforcement learning Human mind is adaptive Lecture outline
  30. 30. Rational analysis and the utility maximization view of the human mind The mind is adapted to its environment. Thus, to understand cognition, we need to study the utility/reward structure of tasks and the environment: 1. Goals: Specify precisely the goals of the cognitive system 2. Environment: Develop a formal model of the environment to which the system is adapted 3. Optimization: Derive the optimal behavioral function given 1-2 above. A long history in economics and psychology
  31. 31. Satisficing and bounded rationality People were found to be "suboptimal" in many tasks; hence Herbert Simon's notion of satisficing
  32. 32. History of bounded rationality
  33. 33. Are people "intuitive statisticians"? People were found not to follow Bayesian decision theory in verbally given statistical reasoning tasks, exhibiting neglect and fallacies. This led to the proposal of informal heuristics and biases as decision-making principles
  34. 34. Challenge #2 for computational rationality If the brain is adapted to compute rationally with bounded resources, reasoning fallacies follow naturally from optimization. "Optimal behavior" does not mean our lay notion of optimality: behavior is optimal in light of organismic objectives and the external environment
  35. 35. Human mind is adaptive Human mind is computational Human mind is rational Human mind is computationally adaptive Bounded agents Reinforcement learning Human mind is adaptive Lecture outline
  36. 36. Humans show tremendous capability to adapt and optimize behavior Perception Attention Procedural memory (e.g., bicycling) Episodic memory (memory for events) Declarative memory (memory for facts)
  37. 37. Find the Weather icon: +
  38. 38. Bayesian brain hypothesis The brain operates in "situations of uncertainty in a fashion that is close to the optimal prescribed by Bayesian statistics" Demonstrated e.g. in • Psychophysics • Perception • Attention • Motor control. Example: The brain is claimed to use Bayes' rule to derive optimal timing decisions based on compromised visual information
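As a minimal illustration of such use of Bayes' rule (all numbers below are hypothetical): with a Gaussian prior over time-to-contact and a noisy Gaussian visual estimate, the Bayes-optimal posterior estimate is a precision-weighted average of the two.

```python
# Hypothetical numbers: Gaussian prior over time-to-contact (seconds)
# combined with a noisy visual estimate via Bayes' rule. The posterior
# mean is a precision-weighted average of prior mean and observation.
mu_prior, sigma_prior = 0.75, 0.20   # prior belief about TTC
x_obs, sigma_obs = 0.60, 0.10        # compromised visual estimate

w = sigma_obs**-2 / (sigma_obs**-2 + sigma_prior**-2)   # weight on the data
mu_post = w * x_obs + (1 - w) * mu_prior
sigma_post = (sigma_obs**-2 + sigma_prior**-2) ** -0.5

print(round(mu_post, 3))   # 0.63: pulled toward the more reliable observation
```

The posterior is also narrower than either source alone, which is the signature of optimal cue combination found in psychophysics.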
  39. 39. Example: Visual statistical learning Gaze distribution on a novel page is driven by expectations of locations based on previous pages
  40. 40. Example: Ecological accounts of the adaptive nature of long-term memory Schooler and Anderson (1989): memory is adapted to need probability
  41. 41. Challenge #3 for computational rationality In many sensorimotor-cognitive tasks, the brain shows Bayesian-like abilities, being able to predict under uncertainty and "repair" missing data. The brain adapts to experienced contingencies in the world
  42. 42. Computational rationality Gershman, Horvitz, & Tenenbaum (2015) Science Human mind is computational Human mind is rational Human mind is computationally adaptive Bounded agents Reinforcement learning Human mind is adaptive Lecture outline
  43. 43. Where are we? Thus far: Research predating computational rationality has shown computational principles for: • Rational decision-making and reasoning • Adaptive cognitive and sensorimotor abilities Computational rationality brings these together to simulate how intelligent agents can reconfigure their behavior flexibly in complex real-world problems
  44. 44. Definition of computational rationality "Computing with representations, algorithms, and architectures designed to approximate decisions with the highest expected utility, while taking into account the costs of computation." Gershman, Horvitz, & Tenenbaum (2015) Science. Models build on "inferential processes for perceiving, predicting, and reasoning under uncertainty"
  45. 45. Three central themes 1. Maximization of expected utility (MEU) as a general-purpose ideal for decision-making under uncertainty 2. Approximating MEU is necessary, because estimation of MEU is non-trivial for most real-world problems 3. The choice of how to approximate it is itself a decision subject to the utility calculus. Breakthroughs started to emerge after probabilistic graphical models...
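Theme 2 can be made concrete with a small sketch (the actions, outcome probabilities, and utilities below are invented for illustration): when exact expected utilities are unavailable, they can be approximated by sampling outcomes.

```python
import random

# Invented decision problem: two actions with different win probabilities.
# Exact MEU is available here, but we approximate it by Monte Carlo
# sampling, as one must in problems too complex for exact calculation.
random.seed(0)

def utility(outcome):
    return {"win": 10.0, "lose": -2.0}[outcome]

def sample_outcome(action):
    p_win = {"safe": 0.5, "risky": 0.3}[action]
    return "win" if random.random() < p_win else "lose"

def approx_expected_utility(action, n_samples=10_000):
    return sum(utility(sample_outcome(action)) for _ in range(n_samples)) / n_samples

best = max(["safe", "risky"], key=approx_expected_utility)
print(best)  # "safe" (exact expected utilities: 4.0 vs 1.6)
```

Theme 3 enters when the sample budget itself is chosen: fewer samples mean a cheaper but noisier decision.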
  46. 46. 1. When to stop computing? Estimating the time-critical losses of continuing computation
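A caricature of this meta-level decision, with invented numbers: deliberation continues only while the expected gain from one more computation step exceeds the cost of the time that step takes.

```python
# Invented numbers: a meta-level stopping rule. Keep computing only
# while the expected improvement from one more step exceeds the cost
# of the time it takes (the value of computation).
def deliberate(expected_gains, cost_per_step):
    """Return the number of computation steps a rational agent runs."""
    steps = 0
    for gain in expected_gains:
        if gain <= cost_per_step:   # further computation not worth its cost
            break
        steps += 1
    return steps

# Each additional step improves the decision by half as much as the last.
gains = [8.0, 4.0, 2.0, 1.0, 0.5]
print(deliberate(gains, cost_per_step=1.5))  # 3
```

Raising the time pressure (cost per step) makes the same agent stop sooner, which is the qualitative pattern computational rationality predicts for time-critical decisions.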
  47. 47. 2. Resource-constrained sampling
  48. 48. 3. Trade-off among cognitive systems
  49. 49. Problem: One-shot concept learning Lake et al. (2015) Science ”The model represents concepts as simple programs that best explain observed examples under a Bayesian criterion. On a challenging one-shot classification task, the model achieves human-level performance while outperforming recent deep learning approaches. We also present several “visual Turing tests” probing the model’s creative generalization abilities, which in many cases are indistinguishable from human behavior.”
  50. 50. 4. Probabilistic program induction Lake et al. (2015) Science
  51. 51. Summary Recent breakthroughs have found new ways to approximate MEU, e.g. in reinforcement learning and by using probabilistic graphical models. But these do not sum up to a unified view of CR. What we have is a loose goal and a set of principles...
  52. 52. Bounded agents Lewis, Howes, Singh 2014 Topics in Cog Sci Human mind is computational Human mind is rational Human mind is computationally adaptive Bounded agents Reinforcement learning Human mind is adaptive Lecture outline
  53. 53. Emergence of adaptive behavior “Interaction emerges in a system consisting of rewards and costs (or utilities), actions, and constraints (e.g., structure of the environment). Adaptation is exhibited in different strategies for using a computer.” Howes et al. 2009; Payne & Howes 2013 [Diagram: capacities, utilities, and ecology jointly define the space of possible behaviors; within it, a space of reasonable behaviors; and within that, optimal behavior]
  54. 54. Overview • Assume that users behave (approximately) to maximize utility given limits on their own capacity • Optimality bounded by (1) the environment; (2) utility; and (3) the user’s capabilities • People are “bounded agents” • Optimal behavioral strategies can be estimated using e.g. reinforcement learning • No need for hard-wiring task procedures (cf. “old cognitive models”)
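The last point, estimating optimal strategies with reinforcement learning, can be sketched with minimal tabular Q-learning on a toy chain task (hypothetical, not one of the models cited here): the behavioral strategy emerges from rewards and constraints rather than from a hand-coded task procedure.

```python
import random

# Toy chain task: states 0..4, actions -1/+1, reward 1.0 on reaching
# state 4. The rightward strategy is never hand-coded; it emerges
# from tabular Q-learning over episodes of interaction.
random.seed(1)
n_states, actions = 5, (-1, 1)
Q = {(s, a): 0.0 for s in range(n_states) for a in actions}
alpha, gamma, eps = 0.5, 0.9, 0.1

for _ in range(2000):
    s = 0
    while s != n_states - 1:
        if random.random() < eps:                  # explore
            a = random.choice(actions)
        else:                                      # exploit, random tie-break
            a = max(actions, key=lambda b: (Q[(s, b)], random.random()))
        s2 = min(max(s + a, 0), n_states - 1)
        r = 1.0 if s2 == n_states - 1 else 0.0
        target = r + gamma * max(Q[(s2, b)] for b in actions)
        Q[(s, a)] += alpha * (target - Q[(s, a)])
        s = s2

policy = {s: max(actions, key=lambda b: Q[(s, b)]) for s in range(n_states - 1)}
print(policy)  # every non-terminal state prefers +1 (move right)
```

The same machinery scales (with function approximation) to the HCI agents discussed later in the lecture.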
  55. 55. Key assumptions Bounded optimality: Cognitive mechanisms adapt not only to the structure of the environment but to the human mind/brain itself. Theories of computational rationality are optimal-program problems
  56. 56. Definitions A bounded agent is a machine M with • OM, a space of possible observations • AM, a space of possible actions • PM, a space of programs that can run on the machine Choosing a program p specifies an agent model <M,p>. Behavior is a history of alternating observations and actions, <o1, a1, o2, a2, ...>. An agent is bounded when its behaviors exhibit a subset of all possible behaviors
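A toy rendering of these definitions (the environment and program below are hypothetical): a program maps observations to actions, and running the agent against an environment produces a history of alternating observations and actions.

```python
# Hypothetical rendering of the definitions: a program p maps
# observations to actions; running agent <M, p> in an environment
# produces a history of alternating observations and actions.
def run(program, environment, n_steps):
    history, state = [], environment["init"]
    for _ in range(n_steps):
        obs = environment["observe"](state)
        act = program(obs)
        history.extend([obs, act])
        state = environment["step"](state, act)
    return history

# Toy environment: observe a counter; the program reports its parity.
env = {"init": 0, "observe": lambda s: s, "step": lambda s, a: s + 1}
parity_program = lambda obs: obs % 2

print(run(parity_program, env, 3))  # [0, 0, 1, 1, 2, 0]
```

Boundedness shows up here as the restriction of behaviors to those some program in PM can generate; a different program over the same machine yields a different bounded agent.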
  57. 57. Bounded optimal programs Machine M can be any cognitive or neural model. A utility function U gauges goals, tasks, and subjective utility. Bounded optimality (Russell & Subramanian 1995): the set of optimal programs for a machine maximizes expected utility, with an expectation over a distribution of environments and, within each environment, an expectation over histories; roughly, p* ∈ argmax over p ∈ PM of E over environments e of E over histories h of U(h)
  58. 58. Remarks The cost of finding the optimal program and the cost of executing it are different. The optimality of a program is not the same as the optimality of behavior. Multiple levels of optimality explanations can be identified: • Ecological optimality • Bounded optimality • Ecological-bounded optimality
  59. 59. Remark about “human-level” performance claims in AI The Atari game-playing DL agent is not solving the same problem as humans are when they play games. Different observations & actions -> different bounded programs
  60. 60. Examples of bounded agents
  61. 61. Three bounded agents in HCI tasks A visual search agent (Jokinen et al. Proc. CHI 2017) • Solves a sampling problem: where to gaze when searching a UI • The optimal bounded program is a strategy for recruiting its own capabilities to optimally sample the display A text entry agent (Sarcar et al. IEEE Pervasive Computing 2018) • Solves a sampling and control problem: where to gaze and where to move the fingers when entering text A button-pressing agent (Oulasvirta et al. Proc. CHI 2018) • Solves a control problem: how to control muscles to press a button in order to improve its own precision in activating it in time • The optimal bounded program is an intrinsic probabilistic model that tells which muscle signal to send for desired effects (button activation, temporal precision, muscular effort)
  62. 62. 1. Visual sampling Predicts visual search behavior after a layout has changed (Jokinen et al. Proc. CHI 2017). The visual search model predicts visual search times for new and changed layouts. For a novice user without any prior exposure to the layout, the model predicts that, of the three elements chosen for this comparison, the salient green element is the fastest to find. After learning the locations, the expert model finds all elements fairly quickly. At this point, one blue element and the green element change place. Search times for the moved elements are longer than for the green element, because the model remembers the distinctive features of the latter. Figure 2: On the basis of expected utility, the controller requests attention deployment to a new visual element from the eye-movement system; this directs attention to the most salient unattended visible object.
  63. 63. 1. Visual sampling Utility learning (Jokinen et al. Proc. CHI 2017). Encoding an object allows the model to decide whether it is the target or a distractor. Before the model can encode any objects, it needs to attend one. The feature-guidance component holds a visual representation of the environment and, at the controller's request, resolves the request to deploy attention to one of the objects in it. The attended target is determined by the properties of the visual objects, whose presence in the visual representation is based on their eccentricity. A feature is visually represented if its angular size is larger than a·e² − b·e, (1) where e is the eccentricity of the object (in the same units as the size) and a and b are free parameters that depend on the visual feature in question. Their values, from the literature, are a = 0.104 and b = 0.85 for colour, 0.14 and 0.96 for shape, and 0.142 and 0.96 for size [35]. On the basis of the represented visual features, each object is given a total activation as a weighted sum of bottom-up and top-down activations. Bottom-up activation is the saliency of an object, calculated as the dissimilarity of its features v to all other objects of the environment, weighted by the square root of the linear distance d between the objects.
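The eccentricity rule in Equation 1 can be written out directly (parameter values as quoted; the function name is mine):

```python
# Parameter values as quoted on the slide; the function name is mine.
# A feature is represented if its angular size exceeds a*e^2 - b*e,
# where e is eccentricity (in the same units as the size).
PARAMS = {"colour": (0.104, 0.85), "shape": (0.14, 0.96), "size": (0.142, 0.96)}

def feature_visible(feature, angular_size, eccentricity):
    a, b = PARAMS[feature]
    return angular_size > a * eccentricity**2 - b * eccentricity

# A 1-unit coloured feature is available at 5 units of eccentricity,
# but falls below threshold in the far periphery.
print(feature_visible("colour", 1.0, 5.0), feature_visible("colour", 1.0, 15.0))
```

Near the fovea the threshold is negative, so any feature is represented; the quadratic term dominates in the periphery, which is what makes peripheral guidance rely on salient features.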
  64. 64. Results: example Effects of layout change on visual sampling strategy and therefore search costs
  65. 65. 2. Ability-based optimization of text entry Design a text entry method that allows a user to reach maximum performance / minimize errors given his/her abilities. Users aged 65+: 7 wpm with touchscreen devices
  66. 66. Touch-WLM Visuomotor strategies. Modeling sensorimotor performance in text entry
  67. 67. Model parameters represent idiosyncratic and strategic differences
  68. 68. Design space
  69. 69. Optimized designs Baseline, Tremor, Dyslexia. Significant improvements to typing speed
  70. 70. 3. How does the brain achieve control ...of a button? Oulasvirta et al. Proc. CHI 2018
  71. 71. What happens during a button press?
  72. 72. The problem posed to the brain Pressing a button requires careful timing and proportioning of force. The brain should be able to predict how to press a new button and, if it fails, how to repair. A DOF problem + a prediction problem. But buttons are black boxes! [Excerpt from Oulasvirta et al. Proc. CHI 2018: a button is an electromechanical device that makes or breaks a signal when pushed, then returns to its initial (or re-pushable) state when released; it converts a continuous mechanical motion into a discrete electric signal. Tactile push-buttons offer points of interest (POIs) during press-down and release: actuation force is considered the most important design parameter, and the snap ratio determines the intensity of the tactile "bump" (a snap ratio greater than 40% is recommended for a strong tactile feeling). Touch buttons are zero-travel buttons, activated by thresholding the contact area of the finger pulp on the surface; because of false activations, the finger cannot rest on the surface. Mid-air buttons are based on, e.g., computer vision or electromyographic sensing; being contactless, they have no force curve, and activation is determined by joint angle or distance traveled by the fingertip. Figure 2 of the paper shows idealized force–displacement curves for linear and tactile buttons, with green lines for press and blue lines for release curves.]
  73. 73. Neuromechanics: Predictive control of a black box (“the black box”)
  74. 74. Neuromechanics modeling Intrinsic probabilistic model attempts to take over control of its own sensations when pressing a button. Figure 4: NEUROMECHANIC is a computational model of neuromechanics in button-pressing. It implements a probabilistic internal model (Gaussian process regression) that attempts to minimize error between its expected and perceived button activation. Its motor commands are transferred via a noisy and delayed neural channel to muscles controlling the finger. A physical simulation of the finger acting on the button yields four types of sensory signals that are integrated into a single percept (p-center) by means of a maximum likelihood estimator. Oulasvirta et al. Proc. CHI 2018
  75. 75. Elements of the approach Probabilistic internal model (Bayesian optimization using GP) Perceptual control (Predicting the felt consequences of movement) Neural transmission and muscle activation (Noisy signals) Movement dynamics (Mechanics modeling) Multiple noisy sensory signals (Noisy signals) Probabilistic cue integration (Maximum likelihood estimator)
  76. 76. Let’s look inside the box
  77. 77. Perceptual control of button activation Figure 3: Perceptual control of a button: the motor system has no access to the true moment of activation, but it can try to reduce the error between the moment it expected and the moment it perceives. Left: perceptual control fails. Right: precise control. Oulasvirta et al. Proc. CHI 2018
  78. 78. Neuromechanics modeling (Oulasvirta et al. Proc. CHI 2018) NEUROMECHANIC implements these ideas computationally. It consists of two connected sub-models (Figure 4). Objective function: A motor command q sent to the finger muscles consists of three parameters, q = {μA+, τA+, σA+} (1): signal offset μ, signal amplitude τ, and duration σ of the agonist (A+) muscle, with physiologically plausible minima and maxima set for the activation parameters. The objective is to determine the q that minimizes error: min over q of EP(q) + EA(q) + EC(q) (2), where EP is error in predicting perception, EA is error in activating the button, and EC is error in making contact (button touched). Integration of p-centers: The model is connected to four sensory signals (mechanoreception, proprioception, audition, vision); each produces a p-center pci. The maximum-likelihood estimate of the integrated p-center pco is a weighted average: pco = Σi wi·pci, where wi = (1/σi²) / (Σi 1/σi²) (7), with wi being the weight given to the i-th single-cue estimate and σi² being that estimate's variance. Absolute differences among the pci do not affect pco; only the signal variances do. The integrated timing estimate is robust to long delays in, say, auditory or visual feedback. This assumption is based on a study showing that physiological events that take place quickly, within a few hundred milliseconds, do not tend to cause over- or underestimation of event durations [14]. Implementation: NEUROMECHANIC is implemented in MATLAB, using BAYESOPT for Bayesian optimization (the GP model uses the ARD Matérn 5/2 kernel) and SIMSCAPE for mechanics.
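The inverse-variance weighting of Equation 7 is easy to sketch (the p-center values and variances below are illustrative only, not from the paper):

```python
# Illustrative values only: four signal-specific p-centers (ms) and
# their variances, combined by inverse-variance weighting as in Eq. 7.
def integrate_pcenters(estimates, variances):
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    return sum(w * p for w, p in zip(weights, estimates)) / total

# Mechanoreceptive, proprioceptive, auditory, visual p-centers:
pco = integrate_pcenters([100.0, 110.0, 130.0, 160.0],
                         [4.0, 16.0, 64.0, 256.0])
print(pco)  # 104.0: dominated by the lowest-variance (tactile) cue
```

The integrated estimate sits close to the most reliable cue, which is why long delays in a high-variance channel such as vision barely shift pco.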
  79. 79. Parameters (Table 1 of the paper; button parameters are given for physical buttons; task parameters, e.g., finger starting height, are given in the text; f denotes a function):
  fr, radius of finger cone: 7.0 mm
  fw, length of finger: 60 mm
  ρf, density of finger: 985 kg/m³
  cf, damping of finger pulp: 1.5 N·s/m [64]
  kf, stiffness of finger pulp: f, N/m [65]
  wb, width of key cap: 14 mm
  db, depth of key cap: 10 mm
  ρb, density of key cap: 700 kg/m³
  cb, damping of button: 0.1 N·s/m
  ks, elasticity of muscle: 0.8·PCSA [38]
  kd, elasticity of muscle: 0.1·ks [38]
  kc, damping of muscle: 6 N·s/m [38]
  PCSA, physiological cross-sectional area: 4 cm²
  L0ag, L0an, initial muscle length: 300 mm
  σn, neuromuscular noise: 5·10⁻²
  σm, mechanoreception noise: 1·10⁻⁸
  σp, proprioception noise: 8·10⁻⁷
  σa, sound and audition noise: 5·10⁻⁴
  σv, display and vision noise: 2·10⁻²
  Except for the neural noise parameters, all parameters are physically measurable or known. Button-pressing behavior emerges. [Figure 7: data collection on press kinematics in a single-subject study; high-fidelity optical motion tracking tracked a marker on the finger nail, with a custom-made single-button setup created from switches and key caps of commercial keyboards. Simulations compared four button types: tactile, linear, touch, and mid-air.]
  80. 80. Example result: Force–velocity curves. Figure 2. Idealized force–displacement curves for linear (left) and tactile (right) buttons. Green lines are press and blue lines are release curves. Annotations (A–H) are covered in the text. TOUCH BUTTONS: activation is triggered by thresholding the contact area of the pulp of the finger on the surface; because of false activations, the finger cannot rest on the surface. MID-AIR BUTTONS are based not on electromechanical sensing but, for example, on computer vision or electromyographic sensing. Since they are contactless, they do not have a force curve; the point of activation is determined by reference to the angle at a joint or the distance traveled by the fingertip. Latency and inaccuracies in tracking are known issues with mid-air buttons. Oulasvirta et al. Proc. CHI 2018
  81. 81. Emulating a light touch. Figure 11. Predicted muscle force–displacement behavior for a tactile-type button: without and with an effort-minimizing term in the objective function. To prevent NEUROMECHANIC from pushing the button with unrealistically high force, which would in reality cause fatigue and stress, we introduce a controllable ergonomics (or effort) term into the objective. Adding tuning factors, the objective becomes: min_q w_EP E_P(q) + w_EA E_A(q) + w_EC E_C(q) + w_FM F_M(q) (4), where F_M is the muscle force expenditure from the Hill muscle model (see below) and the w_i are tuning factors. By changing the weights, the model can simulate, for example, a user trading off effort versus temporal precision, or a user not caring about temporal precision but only about activating the button. Oulasvirta et al. Proc. CHI 2018
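The weighted objective can be exercised with a minimal sketch. Everything below is invented for illustration: the error terms are toy quadratics standing in for the simulation-derived errors E_P, E_A, E_C, and the paper's Bayesian optimizer (BAYESOPT in MATLAB) is replaced by plain random search within the parameter bounds.

```python
import random

# Toy stand-ins for the error terms of the weighted objective; in
# NEUROMECHANIC these come from the neuromechanical simulation,
# not from closed-form expressions.
def objective(q, w=(1.0, 1.0, 1.0, 0.1)):
    mu, t, s = q                     # motor command: offset, amplitude, duration
    e_p = (mu - 0.12) ** 2           # toy perceptual-error term E_P
    e_a = (t - 0.5) ** 2             # toy activation-error term E_A
    e_c = (s - 0.08) ** 2            # toy contact-error term E_C
    f_m = t * s                      # toy muscle-force (effort) term F_M
    w_ep, w_ea, w_ec, w_fm = w
    return w_ep * e_p + w_ea * e_a + w_ec * e_c + w_fm * f_m

def minimize(n_iter=5000, seed=0):
    """Random search within plausible bounds, standing in for BAYESOPT."""
    rng = random.Random(seed)
    best_q, best_val = None, float("inf")
    for _ in range(n_iter):
        q = (rng.uniform(0.0, 0.3),  # offset mu, in seconds
             rng.uniform(0.0, 1.0),  # amplitude t, normalized
             rng.uniform(0.0, 0.2))  # duration s, in seconds
        val = objective(q)
        if val < best_val:
            best_q, best_val = q, val
    return best_q, best_val
```

Raising w_FM pulls the optimum toward a smaller amplitude and duration, which is how an effort-sensitive, "light touch" press can be emulated.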
  82. 82. Comparison among button types. Peak muscle forces are 1.7–2.0 N for humans; the model predicts 1.4–1.6 N. Mid-air buttons are the worst, as confirmed by the model. In NEUROMECHANIC, the trade-off between force use and temporal precision in the objective function is controlled by the tuning factor w_FM; when w_FM is set to zero, the peak muscle force for a tactile button increases to 2.45 N.
Table 2. Simulation results for the four button types.
Measure | Linear | Tactile | Touch | Mid-air
Perceptual error | 47 ms | 40 ms | 34 ms | 178 ms
Std of perceptual error | 31 ms | 26 ms | 76 ms | 47 ms
Std of activation time | 52 ms | 43 ms | 90 ms | 51 ms
Activation success | 92% | 82% | 94% | 54%
Peak muscle force | 1.65 N | 1.41 N | 2.6 N | 2.9 N
  83. 83. Why are mid-air buttons so unusable? Oulasvirta et al. Proc. CHI 2018
  84. 84. Downstream effects of design and system properties. Figure 4. NEUROMECHANIC is a computational model of neuromechanics in button-pressing. It implements a probabilistic internal model (Gaussian process regression) that attempts to minimize the error between its expected and perceived button activation. Its motor commands are transferred via a noisy and delayed neural channel to the muscles controlling the finger. A physical simulation of the finger acting on the button yields four types of sensory signals that are integrated into a single percept (p-center) by means of a maximum likelihood estimator. Mid-air buttons are worse because of the downstream effects of less reliable sensory feedback.
  85. 85. Discussion. Pros: modeling human behavior using computational rationality • Changes the modeling problem to the definition of observations, actions, bounds, and an optimality principle • An order of magnitude fewer parameters (cf. good old cognitive models) • Behavioral strategies “emerge” Challenges: • What is the right bounded problem (observations, actions)? • What are the right bounds? • What is the optimization mechanism?
  86. 86. Reinforcement learning Gershman & Daw (2017) Annual Review of Psychology Human mind is computational Human mind is rational Human mind is computationally adaptive Bounded agents Reinforcement learning Human mind is adaptive Lecture outline
  87. 87. This part: Revisiting RL from the perspective of neurosciences “Reinforcement learning (RL) is the process by which organisms learn by trial and error to predict and acquire reward.” Requirement: Brains must solve reinforcement learning style problems somehow, as evidenced by their impressive behavioural performance Hard: Curse of dimensionality is compounded by sequential dependency of actions and long-term effects on future reward.
  88. 88. Dayan & Niv 2008
  89. 89. Operant conditioning Skinner box
  90. 90. Model-free learning • Model-free learning (e.g., temporal-difference learning, TD) is easier to execute, as long-run values are already computed and only need to be compared • Adaptive, but less appropriate for changing environments: it fails in latent learning, with distal changes in rewards • Finding: a procedural learning system in the striatum • The firing rate of dopamine neurons in the ventral tegmental area (VTA) and substantia nigra (SNc) appears to mimic the error term of the algorithm (Schultz et al. 1997 Science) • Unconscious and cognitively impenetrable (Pessiglione et al. 2008 Neuron) • The ventral striatum corresponds to the “critic” and the dorsal to the “actor” (O’Doherty et al. 2004 Science)
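As a minimal sketch of the TD idea (illustrative only, not part of the lecture materials): tabular TD(0) on a three-state chain, where the TD error delta is the reward-prediction-error signal that dopamine firing is reported to resemble.

```python
def td_learning(episodes=2000, alpha=0.1, gamma=0.9):
    """Tabular TD(0) on a deterministic chain s0 -> s1 -> s2 (terminal).

    A reward of 1 arrives on entering s2. The TD error
    delta = r + gamma*V(s') - V(s) is the quantity that dopamine neuron
    firing is reported to resemble (Schultz et al. 1997): early in
    learning it spikes at the reward; once values are learned, it shifts
    to the earliest reward-predicting state.
    """
    V = [0.0, 0.0, 0.0]                           # V(s2) stays 0 (terminal)
    for _ in range(episodes):
        for s, s_next, r in [(0, 1, 0.0), (1, 2, 1.0)]:
            delta = r + gamma * V[s_next] - V[s]  # reward prediction error
            V[s] += alpha * delta
    return V
```

After learning, V is approximately [0.9, 1.0, 0.0]: the long-run values are cached and action selection only needs to compare them, but if the reward were silently moved, the cached values would stay stale until re-experienced, which is exactly the latent-learning failure noted above.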
  91. 91. Striatum
  92. 92. Schultz’ 1997 experiment Computational Rationality I – Antti Oulasvirta March 12, 2018 92
  93. 93. Tolman’s cognitive maps. Latent learning experiment.
  94. 94. Model-based learning. Model-based RL solves the latent learning problem by first learning the environment and then the rewards. Associated with the hippocampus, which is responsible for episodic and spatial memories. This discovery led to the rejection of model-free RL as the sole account of reinforcement learning.
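For contrast, a minimal model-based sketch (the corridor, actions, and rewards are invented for illustration): the agent plans by value iteration on a learned transition model, so a distal change in reward only requires replanning, not relearning the environment.

```python
def value_iteration(T, R, gamma=0.9, iters=100):
    """Plan on a learned model: T[s][a] -> next state, R[s] = reward on entry."""
    n = len(T)
    V = [0.0] * n
    for _ in range(iters):
        V = [max(R[T[s][a]] + gamma * V[T[s][a]] for a in T[s]) for s in range(n)]
    return V

def greedy_policy(T, R, V, gamma=0.9):
    """Pick, in each state, the action with the highest one-step lookahead value."""
    return [max(T[s], key=lambda a: R[T[s][a]] + gamma * V[T[s][a]])
            for s in range(len(T))]

# A learned model of a 4-state corridor: actions 'L'/'R' move along it.
T = [{'L': 0, 'R': 1}, {'L': 0, 'R': 2}, {'L': 1, 'R': 3}, {'L': 2, 'R': 3}]

# Reward initially at the right end...
R = [0.0, 0.0, 0.0, 1.0]
V = value_iteration(T, R)
pi = greedy_policy(T, R, V)      # heads right everywhere

# ...then a distal change moves the reward to the left end. A model-based
# agent replans immediately, with no new experience of the corridor needed.
R2 = [1.0, 0.0, 0.0, 0.0]
V2 = value_iteration(T, R2)
pi2 = greedy_policy(T, R2, V2)   # now heads left everywhere
```

A model-free learner caching state values would need to revisit the corridor to unlearn the old values; here the plan flips as soon as the reward entry in the model changes.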
  95. 95. Integrated models proposed. Lee et al. (2014) Neuron. Signatures of both types of learning have been found in neuroscientific studies.
  96. 96. Recognized shortcomings. Scaling up to real-world tasks: laboratory tasks are small and somewhat artificial • A handful of states and actions • Tasks designed to satisfy the Markov conditional independence property • Real-world situations offer extraneous sensory detail that is at once too vast and too impoverished to serve directly as states in RL • States look similar to each other
  97. 97. Tip: Status of RL in neuroscience. The good. The bad but tractable. The ugly: crucial challenges.
  98. 98. One line of work extends this to other types of human memory systems... Based on Larry Squire’s taxonomy (1987)
  99. 99. Example application of MDP
  100. 100. Model of menu search (Chen et al. CHI’15). Finds the optimal gaze pattern given a menu design and the parameters of the visual and cognitive system.
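The actual model is considerably richer, but the underlying idea, learning where to look by reinforcement, can be sketched with a toy one-state "menu" (all details below are invented, not taken from Chen et al.): Q-learning discovers which item to fixate when fixations cost time and finding the target pays off.

```python
import random

def q_learning_menu(target=2, n_items=3, episodes=3000,
                    alpha=0.1, gamma=0.9, eps=0.1, seed=0):
    """Toy gaze policy via Q-learning: actions = which item to fixate.

    Fixating a non-target item costs time (-0.1); fixating the target
    pays +1 and ends the episode. This is a hypothetical stand-in for
    the menu-search MDP, reduced to a single state for brevity.
    """
    rng = random.Random(seed)
    Q = [0.0] * n_items
    for _ in range(episodes):
        done = False
        while not done:
            # epsilon-greedy action selection
            if rng.random() < eps:
                a = rng.randrange(n_items)
            else:
                a = max(range(n_items), key=Q.__getitem__)
            r = 1.0 if a == target else -0.1
            done = a == target
            # single-state Q-update (bootstraps from max Q if not done)
            backup = r if done else r + gamma * max(Q)
            Q[a] += alpha * (backup - Q[a])
    return Q
```

After training, the greedy policy fixates the target item first; the learned Q-values for the other items reflect the time cost of a wasted fixation.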
  101. 101. Inverse Computational Rationality Kangasrääsiö et al. 2017 Proc. CHI
  102. 102. Why did the user click here?
  103. 103. “Algorithmic Sherlock Holmes”
  104. 104. Forward vs. inverse modeling From model to data (forward) -- from data to model (inverse) 104
  105. 105. Role of inverse modeling for CR. Theory formation • CR models need to be fitted to increasingly important and realistic datasets (behavioral, neural, cognitive). Application: “Why did the user click this?” A million-dollar question for Internet-based industries. CR models may disentangle the causes of observed behavior: 1. Teleological explanations (goals) 2. Capacity explanations (cognitive mechanisms) 3. Ecological explanations (structure of tasks and designs)
  106. 106. Alas: Inverse modeling with human data is hard. Multiple explanations for any observation • The same observation can be produced by different mechanisms. Stochasticity. Sparse data. Large individual and contextual variability. Kangasrääsiö et al. Proc. CHI 2017
  107. 107. ABC is a principled way to find optimal model parameters. Figure 1. This paper studies methodology for inference of parameter values of cognitive models from observational data in HCI. At the bottom of the figure, we have behavioral data (orange histograms). Kangasrääsiö et al. Proc. CHI 2017
  108. 108. How ABC works 1. Choose parameter values for the model 2. Simulate predictions 3. Evaluate the discrepancy between predictions and observations 4. Use a probabilistic model to estimate the discrepancy in different regions of the parameter space 5. (Repeat until converged) Kangasrääsiö et al. Proc. CHI 2017
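Steps 1–3 can be sketched with the simplest ABC variant, rejection ABC. The toy simulator and summary statistic below are invented for illustration, and step 4's probabilistic surrogate over the discrepancy is replaced here by plain rejection.

```python
import random
import statistics

def simulate(theta, n=200, rng=random):
    """Hypothetical simulator: stands in for running a cognitive model
    with parameter theta and collecting, e.g., task completion times."""
    return [rng.gauss(theta, 1.0) for _ in range(n)]

def abc_rejection(observed, prior=(0.0, 4.0), n_draws=5000, tol=0.1, seed=0):
    """Rejection ABC: keep parameter draws whose simulated data
    lie close (under a summary statistic) to the observations."""
    rng = random.Random(seed)
    obs_stat = statistics.mean(observed)
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(*prior)                   # 1. choose parameter values
        sim = simulate(theta, rng=rng)                # 2. simulate predictions
        disc = abs(statistics.mean(sim) - obs_stat)   # 3. evaluate discrepancy
        if disc < tol:                                # keep plausible values
            accepted.append(theta)
    return accepted    # samples from the approximate posterior

# "Observed" data generated with a true theta = 2.0, to be recovered
data_rng = random.Random(1)
observed = [data_rng.gauss(2.0, 1.0) for _ in range(200)]
posterior = abc_rejection(observed)
```

The accepted values form a sample from the approximate posterior: their mean should land near the true parameter, and their spread quantifies uncertainty, as on the posterior-estimation slide.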
  109. 109. How ABC works. Approximate Bayesian Computation (ABC). Kangasrääsiö et al. Proc. CHI 2017
  114. 114. How ABC works. Approximate Bayesian Computation (ABC). Indicates the most likely value and its uncertainty. Kangasrääsiö et al. Proc. CHI 2017
  115. 115. Uses of ABC. Optimal selection and calibration of a model for data: 1. Model selection (trying out different models) 2. Parameter inference (choosing the best parameters) 3. Posterior inference (understanding the space of plausible explanations) Kangasrääsiö et al. Proc. CHI 2017
  116. 116. Case: Menu interaction. Given click times only, predict the parameters of the HVS (human visual system). See: Kangasrääsiö et al. CHI 2017
  117. 117. Posterior estimation ABC yields a posterior distribution for the parameters 117
  118. 118. Explaining individual differences 118
  119. 119. Summary
  120. 120. Computational rationality is the study of computational principles of intelligence in living and artificial beings. It looks at intelligence as rational behavior
  121. 121. Main points. Rational + computational + adaptive = computational rationality: the study of the computational principles the mind uses to adapt. CR uniquely allows one both to generate and to infer adaptive behavior in complex tasks. Hard, because 1. the involved computational problems are high-dimensional 2. humans are complex and partially impenetrable 3. theories must be plausible neurally and cognitively
  122. 122. An exciting hotspot for attacking problems at the intersection of AI, ML, cognitive science, and robotics. Computational rationality directly touches on some of the hardest problems in psychology and philosophy of mind: • Connectionist vs. symbolic accounts of mind • The nature vs. nurture debate • Strong vs. weak AI and the possibility of general AI • The roles of consciousness and emotions. Enough exciting topics for several careers...
