CHI 2014 talk by Antti Oulasvirta: Automated Nonlinear Regression Modeling for HCI

  1. 1. Automated Nonlinear Regression Modeling for HCI. Antti Oulasvirta, Max Planck Institute for Informatics and Saarland University, Saarbrücken, Germany.
  2. 2. I have data!
  3. 3. I have data! I need a model! This Note contributes a method that supports model acquisition in HCI.
  4. 4. We focus on nonlinear regression models. Classic examples from HCI: Pointing, Fitts' law T = a + b log2(2A/W); Choice, Hick-Hyman law T = b log2(n + 1); Learning, power law of learning T = aP^b + c; Foraging, information-foraging patch models (Pirolli), e.g. average rate of gain R = λi gi(twi) / (1 + λi twi); also Stevens' power law.
  5. 5. We focus on nonlinear regression models (pointing, foraging, choice, learning). They are "white box" and efficient.
  6. 6. We focus on nonlinear regression models (pointing, foraging, choice, learning): "white box", efficient, with applications in HCI: 1. engineering models, 2. adaptive interfaces, 3. interface optimization.
  7. 7. We focus on nonlinear regression models (pointing, foraging, choice, learning): "white box", efficient, with applications in HCI (engineering models, adaptive interfaces, interface optimization). But hard to acquire!
  8. 8. Current tools offer poor support for the Equation–Evaluation loop.
  9. 9. Exploration is inefficient and laborious: of the set of all possible models defined by your task, most remains unexplored model space.
  10. 10. Exploration is inefficient and laborious: of the set of all possible models defined by your task, most remains unexplored model space.
  11. 11. We propose automated model search: Dataset + Constraints → automated model search → best models. It builds on work in symbolic programming [6,15].
  12. 12. We propose automated model search: Dataset + Constraints → automated model search (a Generate–Test loop) → best models. It builds on work in symbolic programming [6,15].
  13. 13. Iterative search in a model space. Dependent variable y; predictor variables X = {x1, ..., xm}. Winner: y = β1f1(X) + ... + βnfn(X).
  14. 14. Iterative search in a model space. Dependent variable y; predictor variables X = {x1, ..., xm}. Start: y = β1x1 + ... + βnxm. Winner: y = β1f1(X) + ... + βnfn(X).
  15. 15. Iterative search in a model space. Dependent variable y; predictor variables X = {x1, ..., xm}. Start: y = β1x1 + ... + βnxm. Transform/Drop: y = β1(xl ¤ xk) + ... + βnxm. Iterate, guided by a fitness function. Winner: y = β1f1(X) + ... + βnfn(X).
  16. 16. Iterative search in a model space. Dependent variable y; predictor variables X = {x1, ..., xm}. Start: y = β1x1 + ... + βnxm. Transform/Drop: y = β1(xl ¤ xk) + ... + βnxm. Iterate, guided by a fitness function. Winner: y = β1f1(X) + ... + βnfn(X). Presently 16 transformations: algebraic, exponential, logarithmic, trigonometric (sketched in code below).
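
To make the Transform/Drop loop concrete, here is a minimal Python sketch of a generate-and-test search of this kind. It is not the authors' released tool: the names (TRANSFORMS, fitness, search_models), the move probabilities, and the greedy acceptance rule are all illustrative assumptions, and only four of the sixteen transformation types are shown.

    import numpy as np

    # Illustrative subset of the transformation families named on the slide.
    TRANSFORMS = {
        "sqrt": np.sqrt,   # algebraic
        "exp":  np.exp,    # exponential
        "log":  np.log,    # logarithmic
        "sin":  np.sin,    # trigonometric
    }

    def fitness(terms, y):
        """OLS-fit y = b0 + b1*t1 + ... + bn*tn and return R^2."""
        A = np.column_stack([np.ones(len(y))] + terms)
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ beta
        return 1.0 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))

    def search_models(X, y, iters=1000, seed=0):
        """Start from the linear model; repeatedly transform or drop a term."""
        rng = np.random.default_rng(seed)
        best = [X[:, j].copy() for j in range(X.shape[1])]
        best_score = fitness(best, y)
        for _ in range(iters):
            terms = [t.copy() for t in best]
            i = rng.integers(len(terms))
            if rng.random() < 0.8 or len(terms) == 1:   # Transform move
                name = rng.choice(list(TRANSFORMS))
                with np.errstate(all="ignore"):
                    terms[i] = TRANSFORMS[name](terms[i])
                if not np.all(np.isfinite(terms[i])):
                    continue                            # reject invalid transforms
            else:                                       # Drop move
                del terms[i]
            score = fitness(terms, y)
            if score > best_score:                      # keep the fitter model
                best, best_score = terms, score
        return best, best_score

A real implementation would also enforce the cap on free parameters and allow binary combinations of terms (the ¤ on the slide); this sketch keeps only the generate-and-test skeleton.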
  17. 17. Stochastic search method over the set of all possible models defined by your task.
  18. 18. Stochastic search method over the set of all possible models defined by your task.
  19. 19. Command line operation
  20. 20. Command line operation (input: dataset).
  21. 21. Command line operation (input: dataset).
  22. 22. Command line operation (input: dataset).
  23. 23. Multiple controls offered over your model space.
  24. 24. Multiple controls offered. Constraints on the model space: max. number of free parameters; transformations (types, number per term); seed equation.
  25. 25. Multiple controls offered. Constraints on the model space: max. number of free parameters; transformations (types, number per term); seed equation. Controls on the search process: stochasticity; fitness function (e.g., R², AIC, BIC); local search depth. (A configuration sketch follows.)
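
Collected in one place, those controls might look like the following hypothetical configuration object (Python). The field names are invented for illustration; the slides do not show the tool's actual interface.

    from dataclasses import dataclass

    @dataclass
    class SearchConfig:
        # Constraints on the model space
        max_free_parameters: int = 4                  # max. number of free parameters
        transform_types: tuple = ("algebraic", "exponential",
                                  "logarithmic", "trigonometric")
        transforms_per_term: int = 2                  # number of transformations per term
        seed_equation: str = "y = b1*x1 + b2*x2"      # starting point of the search
        # Controls on the search process
        stochasticity: float = 0.5                    # randomness of move selection
        fitness_function: str = "R2"                  # e.g., "R2", "AIC", or "BIC"
        local_search_depth: int = 3                   # neighborhood explored around the incumbent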
  26. 26. Does it work??
  27. 27. Case 1: Comparison with 11 existing models in the literature.
  28. 28. Case 1: Comparison with 11 existing models in the literature, such as mouse pointing (D, W), two-thumb tapping (ID, Telapsed), ..., menu selection (B, I, D, W, Fr).
  29. 29. Case 1: Comparison with 11 existing models in the literature, such as mouse pointing (D, W), two-thumb tapping (ID, Telapsed), ..., menu selection (B, I, D, W, Fr), with increasing numbers of predictors, observations, and model terms.
  30. 30. Case 1: Comparison with 11 existing models in the literature, such as mouse pointing (D, W), two-thumb tapping (ID, Telapsed), ..., menu selection (B, I, D, W, Fr), with increasing numbers of predictors, observations, and model terms. Result: improvements to fitness found in 7 out of 11 cases; comparable model fitness in the others.
  31. 31. Table 1. Benchmarking automatic modeling against previously published models of response time in HCI (baseline vs. this paper):

    #  | Dataset                       | Predictors*    | n   | k | Model provided in paper             | R²**     | Best model found***           | R²
    1  | Stylus tapping (1 oz) [8]     | A, W           | 16  | 2 | a + b log2(2A/W)                    | .966     | a + b log2(A/W)               | .966
    2  | Reanalyzed data [8]           | A, We          |     |   | a + b log2(A/We + 1)                | .987     | a + b(log2(log2 A) · We)      | .981
    3  | Mouse pointing [8]            | A, W           | 16  | 2 | a + b log2(A/W + 1)                 | .984     | a + b log2(A/W)               | .973
    4  |                               | A, We          |     |   | a + b log2(A/We + 1)                | .980     | a + b log10(A/We)             | .978
    5  | Trackball dragging [8]        | A, W           | 16  | 2 | a + b log2(A/W + 1)                 | .965     | a + b log2(A · (W^3)^4)       | .981
    6  |                               | A, We          |     |   | a + b log2(A/We + 1)                | .817     | a + b(A / (1 − e^(log10 We))) | .941
    7  | Magic lens pointing [13]      | A, W, S        | 16  | 3 | a + b log2(D/S + 1) + c log2(S/2/A) | .88      | a + b(1 − 1/A) + c · W^9      | .947
    8  | Tactile guidance [7]          | N, I, D        | 16  | 3 | Eq. 8–9, nonlinear                  | .91, .95 | Nonlinear (k = 3)             | .980
    9  | Pointing, angular [3], Exp. 2 | W, H, α, A     | 310 | 4 | Eq. 33, IDpr, nonlinear             | .953     | Nonlinear (k = 4)             | .962
    10 | Two-thumb tapping [11]        | ID, Telapsed   | 20  | 6 | Eq. 5–6, quadratic                  | .79      | a + b(Telapsed² / ID)         | .929
    11 | Menu selection [2]            | B, I, D, W, Fr | 10  | 6 | Eq. 1–7, nonlinear                  | .99, .52 | Nonlinear (k = 6)             | .990

  Notes: n = number of observations (data rows); k = number of free parameters; * = all variable names are from the original papers, except I, which is interface type (dummy coded); ** = as reported in the paper; *** = some equations omitted due to space restrictions. Observations: (1) pointing datasets 1–6 provide the least room to improve, since the R²s are high to begin with; (2) the method is more successful when there are more predictors: the improvements obtained for datasets 7–11 range from small (8, 9, and 11) to medium (7) to large (10). See the full table in the paper.
  32. 32. (Same table, zoomed in on the baseline models and the best models found; see the previous slide and the full table in the paper.)
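
To give a sense of how a single table row is evaluated, the sketch below fits the row-1 baseline form and the found variant with SciPy's curve_fit and compares their R². The data here is toy data generated for illustration only, not any of the datasets in Table 1.

    import numpy as np
    from scipy.optimize import curve_fit

    def baseline(X, a, b):          # model provided in paper, row 1: a + b*log2(2A/W)
        A, W = X
        return a + b * np.log2(2 * A / W)

    def found(X, a, b):             # best model found, row 1: a + b*log2(A/W)
        A, W = X
        return a + b * np.log2(A / W)

    def r_squared(model, X, y):
        popt, _ = curve_fit(model, X, y)              # least-squares fit of a, b
        resid = y - model(X, *popt)
        return 1 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))

    # Toy pointing data (amplitudes A, widths W, movement times T).
    A = np.array([64, 128, 256, 512, 64, 128, 256, 512], dtype=float)
    W = np.array([8, 8, 8, 8, 16, 16, 16, 16], dtype=float)
    T = 0.2 + 0.1 * np.log2(2 * A / W) + np.random.default_rng(1).normal(0, 0.01, 8)

    print(r_squared(baseline, (A, W), T), r_squared(found, (A, W), T))

Note that the two row-1 forms differ only by a constant (log2(2A/W) = 1 + log2(A/W)) that is absorbed into a, which is why Table 1 reports the same R² of .966 for both.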
  33. 33. But my data is more complex!
  34. 34. Case 2: Complex dataset. Multitouch-rotation data [Hoggan et al., Proc. CHI '13]. Dependent variable: MT. Predictors: Angle, Diameter, X position, Y position, Direction.
  35. 35. Case 2: Complex dataset. Multitouch-rotation data [Hoggan et al., Proc. CHI '13]; dependent variable MT; predictors Angle, Diameter, X position, Y position, Direction. The method found a model with R² = 0.835, another with seven free parameters and R² = 0.827, and a compact four-parameter nonlinear model with R² = 0.805 in the variables x0, ..., x3 (x-position, y-position, angle, and diameter).
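
Choosing among these differently sized candidates is exactly the fitness-score question: by R² alone, the larger model always looks best. An information criterion such as AIC (one of the fitness options listed on slide 25) penalizes extra parameters. A minimal sketch under the usual Gaussian least-squares assumption; the helper name is mine, not the tool's:

    import numpy as np

    def aic(residuals, k):
        """AIC for a least-squares fit: n*ln(RSS/n) + 2k, with k free parameters."""
        n = len(residuals)
        rss = residuals @ residuals
        return n * np.log(rss / n) + 2 * k

Lower AIC is better: the seven-parameter model must reduce the residual sum of squares enough to offset a 2 × 3 = 6 penalty relative to the four-parameter one.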
  36. 36. But the models don’t make sense!
  37. 37. Case 3: Theoretically motivated operations. Dataset → Model.
  38. 38. Case 3: Theoretically motivated operations. Dataset → Model. From the paper: for Dataset 11, the transformations were restricted to the operations used in the original paper (1/x, log2(x), ×, /, +, −); many models were found with three free parameters and R² = 0.90.
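
In terms of the search sketch shown earlier, Case 3 amounts to replacing the general transformation set with only the theoretically motivated operations above; a hypothetical rendering (the binary ×, /, +, − would similarly constrain how terms combine):

    import numpy as np

    # Unary operations restricted to those used in the original paper (Case 3).
    THEORY_TRANSFORMS = {
        "reciprocal": lambda x: 1.0 / x,   # 1/x
        "log2": np.log2,                   # log2(x)
    }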
  39. 39. But I need a model for MANY datasets!
  40. 40. Case 4: Multiple datasets, one model. Dataset 1 + Dataset 2 + Dataset 3 → Model. In the paper, a single model form covering three pointing datasets (1, 3, and 5) was tested, with free parameters fitted per dataset. (Sketch below.)
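
A minimal sketch of the Case 4 idea: one shared model form whose free parameters are refitted per dataset. The Fitts-style form and all names here are illustrative stand-ins, not the form the paper actually found for datasets 1, 3, and 5.

    import numpy as np
    from scipy.optimize import curve_fit

    def shared_form(X, a, b):
        """One model form shared across datasets (stand-in example)."""
        A, W = X
        return a + b * np.log2(A / W + 1)

    def fit_per_dataset(datasets):
        """datasets: list of ((A, W), T) pairs; returns one (a, b) per dataset."""
        return [curve_fit(shared_form, X, T)[0] for X, T in datasets]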
  41. 41. Conclusion & Discussion • Proof-of-concept • Model identification by defining constraints • Supports different modeling tasks in HCI • Promising results • Limitations and open questions • E.g., assumptions of nonlinear modeling (see paper) • “Brute force” approach • Warning against “fishing”! • Future work: performance and expressive controls
  42. 42. Project homepage (code forthcoming!): http://www.mpi-inf.mpg.de/~oantti/nonlinearmodeling/ antti.oulasvirta@aalto.fi Take-away: • Model identification by constraint definition. Acknowledgements: This research was funded by the Max Planck Centre for Visual Computing and Communication and the Cluster of Excellence on Multimodal Computing and Interaction at Saarland University.
