My Research Defense

  • Transcript

    • 1. Mental Workload in Multi-Device Personal Information Management Manas Tungare Advisory Committee: Dr. Manuel Pérez-Quiñones Dr. Stephen H. Edwards Dr. Edward A. Fox Prof. Steve Harrison Dr. Tonya Smith-Jackson Thursday, February 12, 2009
    • 2. Talk outline: presentation & questions (~45 min), then additional comments and suggestions (up to ~90 min total). OK to record audio? Your questions/comments are welcome at any time. Thursday, February 12, 2009
    • 3. Problem statement & Research questions Thursday, February 12, 2009
    • 4. Personal information, Multiple devices Thursday, February 12, 2009
    • 5. State of the art • Difficult to maintain files on 2+ machines • Workaround: USB drives, email-to-self • Multiple paper calendars are difficult to read • Workaround: Online calendars • Hard to enter phone numbers on phone • Workaround: Sticky notes Thursday, February 12, 2009
    • 6. General Hypothesis • PIM strategies may result in high workload • leading to increased perception of task difficulty • Alternate strategies may lead to lower workload Thursday, February 12, 2009
    • 7. Mental workload issues • What is the mental workload incurred by users when they are trying to use multiple devices for personal information management? • For those tasks that users have indicated are frustrating for them, do the alternate strategies result in lower mental workload? • Are multi-dimensional subjective workload assessment techniques (such as NASA TLX) an accurate indicator of operator performance in information ecosystems? Thursday, February 12, 2009
    • 8. Mental workload • [...] “That portion of an operator’s limited capacity actually required to perform a particular task.” [O’Donnell and Eggemeier, 1986] • Low to moderate levels of workload are associated with acceptable levels of operator performance [Wilson and Eggemeier, 2006] • Measured using subjective measures or physiological measures Thursday, February 12, 2009
    • 9. Research Question 1 • RQ: Do alternate strategies impose different levels of mental workload? • Hypothesis: Alternate strategies lead to lower mental workload than the standard strategies • Experiment: Compare mental workload for tasks identified as difficult, and for their respective workarounds Thursday, February 12, 2009
    • 10. Research Question 2 • RQ: Are subjective assessments of mental workload an accurate indicator of operator performance in this domain? • Hypothesis: Mental workload measured by NASA TLX can be used to predict operator performance • Experiment: (Attempt to) correlate workload assessments with operator performance Thursday, February 12, 2009
    • 11. Research Question 3 • RQ: Are both subjective measures of workload (TLX) and physiological measures (pupil diameter) sensitive to PIM tasks? • What can we learn from changes in pupil diameter in relation to sub-task boundaries? Thursday, February 12, 2009
    • 12. Experiment design Thursday, February 12, 2009
    • 13. Survey • N = 220 • Responses to free-form questions in survey • 5 tag types defined a priori: • Devices, tasks, problems, solutions, results • Tags based on emergent codes • Device=laptop, desktop • Problem=syncFailed, conflictingEdits Thursday, February 12, 2009
    • 14. Experiment design • Within subjects (repeated measures), in two sessions 2 weeks apart to minimize learning effects • Complete block design • Two-factor (task, level of system support) • 6 treatments: 3 tasks ⨉ 2 levels of system support • Counterbalanced to minimize order effects (see the counterbalancing sketch after the transcript) Thursday, February 12, 2009
    • 15. Overview • Tasks: Files, Calendar, Contacts • Level 0: multiple paper calendars, no support for synchronization, no support for file migration • Level 1: online calendars, devices support synchronization, system supports file migration • [Figure: sample paper calendar handout for the study ("PIM Study - Home", week of January 5 to January 11, 2009) with entries such as a team outing, a dentist's appointment, and Michael's Little League game (tentative; confirm with Alex)] Thursday, February 12, 2009
    • 16. Sample size estimation • After first 8 participants (Cohen's d, estimated sample size): Files d = 0.671, n = 9.778; Calendar d = 0.528, n = 15.098; Contacts d = 0.536, n = 14.672; All tasks d = 0.602, n = 11.861 • Effect sizes = medium to high for Overall Workload • Goal for sample size is 20 (see the power-analysis sketch after the transcript) Thursday, February 12, 2009
    • 17. Participants • Knowledge workers recruited via email, flyers, personal contacts and promises of pizza • Experienced in laptop & phone use • N = 11 (6 male, 5 female) [chart: age distribution across brackets 18-21, 22-25, 26-30, 31-35] Thursday, February 12, 2009
    • 18. Eye tracker [photo of the setup: desktop, laptop, instructions display, and eye tracker recorder; photo credit: Ramanujam Parthasarathy] Thursday, February 12, 2009
    • 19. Task familiarization • 6 videos were made, related to tasks • each between 2–6 minutes long • 10 familiarization tasks required to be performed before experimental tasks • Watch videos Thursday, February 12, 2009
    • 20. Files task • Start on desktop • Set of instructions to edit specific files • Then move to laptop, edit more files • Move back to desktop • L0: using USB drives, email-to-self • L1: using a Network drive Thursday, February 12, 2009
    • 21. Task instructions Thursday, February 12, 2009
    • 22. Calendar task • Set of instructions to create, replace, update, delete calendar entries • “Today is …” • Questions on availability and schedule • L0: Paper calendars, home and work • L1: Online calendars, home and work Thursday, February 12, 2009
    • 23. Task instructions Thursday, February 12, 2009
    • 24. Contacts task • Set of instructions to create, replace, update, delete contact records • “You may/may not use your phone/laptop” • L0: phone + laptop, no sync support • L1: phone + laptop, with sync support Thursday, February 12, 2009
    • 25. Task instructions Thursday, February 12, 2009
    • 26. Measures • Time on task • captured by app that displays instructions • Task performance metrics (vary by task) • NASA TLX (see the TLX scoring sketch after the transcript) • Pupillometric data from eye tracker Thursday, February 12, 2009
    • 27. Why NASA TLX • Higher correlation with performance (concurrent validity) as compared to SWAT and WP [Rubio & Díaz, 2004] • Validated in several environments since 1988 [several, 1988-present] • Sensitive to some differences not discriminated by SWAT [Battiste 1988] • Highest sensitivity among 4 scales [Hill 1989] Thursday, February 12, 2009
    • 28. Pupillometric data • Pupil diameter can be used as an estimate of mental workload [Beatty 1982] • Task-Evoked Pupillary Response (TEPR) • Physiological measure (not subjective) • Continuous measure (unlike TLX) • Post-processing is required Thursday, February 12, 2009
    • 29. Analysis Thursday, February 12, 2009
    • 30. RQ1: Workload at L0 & L1 • Effort is significantly lower at α=0.05 for L1 than for L0 (ANOVA, N=8; see the repeated-measures sketch after the transcript) • TLX scores (mean L0 / mean L1 / p value): MD: Mental Demand 48.9 / 40 / 0.1878; PD: Physical Demand 35.3 / 33 / 0.7271; TD: Temporal Demand 40.3 / 30.5 / 0.1197; OP: Own Performance 27.6 / 17.8 / 0.0604; EF: Effort 51.1 / 35.5 / 0.0382 ✓; FR: Frustration 38.2 / 25.8 / 0.0564; OW: Overall Workload 41.4 / 31.4 / 0.0666 Thursday, February 12, 2009
    • 31. Time on task (mean (SD), L0 / L1 / p value): Files 2663 (802) / 2309 (601) / 0.394; Calendar 2754 (1677) / 1786 (1077) / 0.226; Contacts 2558 (1368) / 1832 (1478) / 0.377 Thursday, February 12, 2009
    • 32. RQ2: Performance predictor • TLX OW not found to correlate highly with time on task • Pearson's r: Workload ~ Time on Task (see the correlation sketch after the transcript) • r = 0.188 for Files • r = –0.014 for Calendars • r = 0.031 for Contacts Thursday, February 12, 2009
    • 33. RQ2: Performance predictor • Pearson's r (Files / Calendar / Contacts): MD: Mental Demand 0.271 / –0.171 / 0.087; PD: Physical Demand 0.140 / 0.190 / –0.226; TD: Temporal Demand 0.095 / 0.074 / –0.254; OP: Own Performance 0.288 / 0.036 / –0.086; EF: Effort 0.196 / 0.016 / 0.227; FR: Frustration 0.393 / 0.135 / 0.083; OW: Overall Workload 0.188 / 0.014 / –0.031 • Further analysis at step-level, not task-level Thursday, February 12, 2009
    • 34. Time on task: Files [chart: time taken (s) per step, steps 1-10, L0 vs. L1; move from desktop to laptop p = 0.1624, move from laptop to desktop p = 0.1577] Thursday, February 12, 2009
    • 35. Time on task: Calendars [chart: time taken (s) per step, steps 1-15, L0 vs. L1; data lookup steps and data entry steps annotated] Thursday, February 12, 2009
    • 36. Time on task: Contacts [chart: time taken (s) per step, steps 1-6, L0 vs. L1] Thursday, February 12, 2009
    • 37. RQ3: TLX & Pupillometric • Analyzing pupillometric data • 100,000 data points per session @ 30 Hz • Need to filter blinks • Establish baseline; compute relative changes • Signal smoothing techniques (see the preprocessing sketch after the transcript) Thursday, February 12, 2009
    • 38. Initial results from pupillometric data [chart: pupil radius (eye image pixels) vs. time elapsed (seconds), with sub-task boundaries S0-S10 marked] Thursday, February 12, 2009
    • 39. Initial results from pupillometric data [chart: pupil radius (eye image pixels) vs. time elapsed (seconds), longer session, with sub-task boundaries S0-S10 marked] Thursday, February 12, 2009
    • 40. Expected outcomes (“So what?”) Thursday, February 12, 2009
    • 41. Alternate strategies • Lower mental workload ✓ • Lower time on task ✓ • Synced phones have more data entered ✓ • (Slightly) fewer errors Thursday, February 12, 2009
    • 42. Observations • Online calendars provided a frame of reference at all times (highlighted day) • A few chose not to sync calendars (in L1) • None prepared for the transition until asked to switch machines • The step after “move now” takes a lot of time — participants don’t realize they’re missing information until they need it Thursday, February 12, 2009
    • 43. Identify critical sub-tasks • Files: Time-on-task was a highly discriminative measure for the sub-task of moving from one machine to another • Pupillometric measure appears sensitive to changes in workload across sub-tasks • Calendar task: paper was faster for data entry, online was faster for lookup ★ Optimize selectively, remove bottlenecks Thursday, February 12, 2009
    • 44. Cross-task measure • TLX can be used to study PIM tasks • E.g. which of browsing or searching leads to higher workload? • E.g. does Tool A lead to lower workload than Tool B? Thursday, February 12, 2009
    • 45. Studying multiple devices • Study each one individually? • What happens at the transition … • ... Thursday, February 12, 2009
    • 46. Questions & comments ? ! Note to self: Turn off audio recording before committee deliberation. Thursday, February 12, 2009
    • 47. Questions & comments ? ! Thank you! Note to self: Turn off audio recording before committee deliberation. Thursday, February 12, 2009
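
Slide 14 says the 6 treatments (3 tasks ⨉ 2 levels of system support) were counterbalanced to minimize order effects, but does not state the scheme used. Below is a minimal sketch of one common option, a balanced Latin square; the treatment labels and assignment of rows to participants are illustrative, not the study's actual orderings.

```python
# Sketch: balanced Latin square ordering for an even number of conditions.
# For n = 6 treatments, each treatment appears once in every serial position,
# and each treatment immediately follows every other treatment exactly once.

def balanced_latin_square(n):
    """Return n orderings (rows) of n conditions, for even n."""
    # Standard construction: first row is 0, 1, n-1, 2, n-2, 3, ...
    first = [0]
    lo, hi = 1, n - 1
    while len(first) < n:
        first.append(lo)
        lo += 1
        if len(first) < n:
            first.append(hi)
            hi -= 1
    return [[(c + i) % n for c in first] for i in range(n)]

# 3 tasks x 2 levels of system support = 6 treatments (labels are illustrative).
treatments = ["Files-L0", "Files-L1", "Calendar-L0",
              "Calendar-L1", "Contacts-L0", "Contacts-L1"]

for p, row in enumerate(balanced_latin_square(len(treatments))):
    print(f"Participant order {p + 1}:", [treatments[i] for i in row])
```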
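
Slide 16 derives per-task sample size estimates from Cohen's d after the first 8 participants. The slide does not state the alpha, power, or test used, so the conventional settings below will not necessarily reproduce the slide's numbers; this is only a sketch of a standard power analysis for a within-subjects (paired) comparison using statsmodels.

```python
# Sketch: required sample size for a paired (within-subjects) t-test,
# given an observed effect size (Cohen's d). Alpha and power here are
# conventional defaults, not necessarily those used for the slide.
from statsmodels.stats.power import TTestPower

effect_sizes = {"Files": 0.671, "Calendar": 0.528,
                "Contacts": 0.536, "All tasks": 0.602}

analysis = TTestPower()  # one-sample / paired t-test power analysis
for task, d in effect_sizes.items():
    n = analysis.solve_power(effect_size=d, alpha=0.05, power=0.8,
                             alternative="two-sided")
    print(f"{task}: d = {d:.3f} -> n ≈ {n:.1f} participants")
```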
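
Slides 26-30 use NASA TLX scores as the subjective workload measure. As a reminder of how an overall workload score is typically computed, here is a minimal sketch of the standard weighted TLX scoring procedure (six subscales rated 0-100, weights from 15 pairwise comparisons); the ratings and weights below are made up for illustration.

```python
# Sketch: standard (weighted) NASA TLX scoring.
# Six subscales rated 0-100; each weight is how often that subscale was
# chosen in the 15 pairwise comparisons, so the weights sum to 15.

SUBSCALES = ["MD", "PD", "TD", "OP", "EF", "FR"]

def tlx_overall(ratings, weights):
    """Weighted overall workload = sum(rating * weight) / 15."""
    assert sum(weights.values()) == 15, "weights must come from 15 pairwise comparisons"
    return sum(ratings[s] * weights[s] for s in SUBSCALES) / 15.0

# Illustrative (made-up) data for one participant and one treatment.
ratings = {"MD": 55, "PD": 20, "TD": 40, "OP": 25, "EF": 50, "FR": 35}
weights = {"MD": 4, "PD": 1, "TD": 3, "OP": 2, "EF": 3, "FR": 2}

print("Overall workload:", tlx_overall(ratings, weights))
```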
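
Slide 30 reports an ANOVA comparing TLX ratings at L0 and L1 for N=8. The exact model is not shown on the slide; the sketch below only illustrates how such a within-subjects comparison could be run with statsmodels' repeated-measures ANOVA, using a made-up placeholder data frame rather than the study's data.

```python
# Sketch: repeated-measures ANOVA on a TLX subscale (e.g., Effort),
# with level of system support (L0 vs. L1) as the within-subjects factor.
# The values below are made-up placeholders, not the study's data.
import pandas as pd
from statsmodels.stats.anova import AnovaRM

data = pd.DataFrame({
    "participant": list(range(1, 9)) * 2,
    "level":       ["L0"] * 8 + ["L1"] * 8,
    "effort":      [60, 55, 48, 52, 45, 58, 40, 51,   # L0 ratings (made up)
                    42, 38, 35, 40, 30, 44, 28, 37],  # L1 ratings (made up)
})

result = AnovaRM(data, depvar="effort", subject="participant",
                 within=["level"]).fit()
print(result)
```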
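
Slides 32-33 correlate TLX workload with time on task using Pearson's r. A minimal sketch of that computation with SciPy; the arrays are made-up placeholders standing in for per-participant values.

```python
# Sketch: Pearson correlation between overall TLX workload and time on task
# for one task type. The arrays are made-up placeholders.
from scipy.stats import pearsonr

overall_workload = [41, 35, 52, 28, 47, 33, 39, 45]            # TLX OW per participant
time_on_task_s   = [2650, 2200, 3100, 1900, 2800, 2300, 2500, 2950]

r, p = pearsonr(overall_workload, time_on_task_s)
print(f"Pearson's r = {r:.3f}, p = {p:.3f}")
```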
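
Slide 37 lists the pupillometric preprocessing steps: filter blinks, establish a baseline, compute relative changes, and smooth the signal. Below is a minimal sketch of one way to do this with pandas at 30 Hz; the column names, blink criterion, baseline window, and smoothing window are illustrative assumptions, not the study's actual pipeline.

```python
# Sketch: basic pupil-diameter preprocessing at 30 Hz.
# Steps: drop blink samples, interpolate the gaps, establish a baseline,
# express the signal as relative change from baseline, then smooth it.
# Column names, thresholds, and window sizes are illustrative assumptions.
import pandas as pd

def preprocess_pupil(df, fs=30, baseline_seconds=5, smooth_seconds=1.0):
    d = df.copy()

    # 1. Blink filtering: assume the tracker reports (near-)zero radius during
    #    blinks; treat those samples as missing and interpolate across them.
    d.loc[d["pupil_radius"] <= 0, "pupil_radius"] = None
    d["pupil_radius"] = d["pupil_radius"].interpolate(limit_direction="both")

    # 2. Baseline: mean radius over the first few seconds of the recording.
    baseline = d["pupil_radius"].iloc[: fs * baseline_seconds].mean()

    # 3. Relative change from baseline (task-evoked pupillary response).
    d["relative_change"] = (d["pupil_radius"] - baseline) / baseline

    # 4. Smoothing: centered moving average over a short window.
    window = int(fs * smooth_seconds)
    d["smoothed"] = d["relative_change"].rolling(window, center=True,
                                                 min_periods=1).mean()
    return d

# Example usage with a hypothetical export; real data would come from the eye tracker.
# df = pd.read_csv("pupil_session.csv")   # columns: time_s, pupil_radius
# processed = preprocess_pupil(df)
```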
