CHI'07: Biases in Human Estimation of Interruptibility

Here are the slides for my CHI 2007 presentation. Enjoy.

  1. 1. Biases in Human Estimation of Interruptibility: Effects and Implications for Practice Daniel Avrahami, James Fogarty & Scott Hudson
  2. 3. Introduction <ul><li>Estimating someone else’s interruptibility is something we do every day </li></ul><ul><ul><li>At home, at work, at the store </li></ul></ul><ul><li>…but it’s also something we’re not always good at </li></ul><ul><li>This becomes much harder when done over a distance </li></ul><ul><li>Let’s try this together: </li></ul>
  3. 4. Estimating Interruptibility <ul><li>1 2 3 4 5 </li></ul><ul><li>Highly Interruptible Highly Non-Interruptible </li></ul>30 seconds
  4. 5. Estimating Interruptibility <ul><li>1 2 3 4 5 </li></ul><ul><li>Highly Interruptible Highly Non-Interruptible </li></ul>
  5. 6. Prompting for Self-Report <ul><li>1 2 3 4 5 </li></ul><ul><li>Highly Interruptible Highly Non-Interruptible </li></ul>
  6. 7. Prompting for Self-Report <ul><li>1 2 3 4 5 </li></ul><ul><li>Highly Interruptible Highly Non-Interruptible </li></ul>
  7. 8. And the answer is… <ul><li>1 2 3 4 5 </li></ul><ul><li>Highly Interruptible Highly Non-Interruptible </li></ul>
  8. 9. And the answer is… <ul><li>1 2 3 4 5 </li></ul><ul><li>Highly Interruptible Highly Non-Interruptible </li></ul>3
  9. 10. Goals <ul><li>A better understanding of the biases in estimating others’ interruptibility can inform the design of CMC and awareness systems </li></ul><ul><li>Provide insight into how people are likely to use different pieces of contextual information </li></ul><ul><ul><li>For example, the state of an office door was significantly correlated with errors in estimation </li></ul></ul><ul><li>A lot of previous work on CMC and awareness systems [cf. Fish’90, Mantei’91, Dourish’92, Bly’93, Adler’94, Tang’01] </li></ul>
  10. 11. Research Questions <ul><li>Which contextual cues (e.g., working on the computer) affect the error in human estimation of another person’s interruptibility? </li></ul><ul><li>What is the source of a contextual cue’s effect on the bias in estimation? </li></ul><ul><ul><li>e.g., overrating the strength of a cue </li></ul></ul>
  11. 12. Talk Outline <ul><li>Study Design </li></ul><ul><li>Measures and Analysis </li></ul><ul><li>Results </li></ul><ul><li>Conclusions and Future work </li></ul>Details (f-measures, etc.)
  12. 13. Study Design
  13. 14. Study Design <ul><li>Compare Self-Reports of Interruptibility ( Reported ) with Estimations of Interruptibility ( Estimated ) </li></ul><ul><li>Two groups of participants: </li></ul><ul><ul><li>Reporters (in their natural work environment) </li></ul></ul><ul><ul><li>Estimators (viewing video and audio recordings) </li></ul></ul>
  14. 15. Reporters <ul><li>Four high-level staff members (3 females, 1 male) </li></ul><ul><li>Audio and video recordings in their offices during a month-long period </li></ul><ul><li>Experience-sampling method used to collect self-reports of interruptibility at random intervals </li></ul><ul><ul><li>(30-minute average) </li></ul></ul><ul><li>672 self-reports and over 600 hours of data </li></ul><ul><li>[Used in Hudson’03, Fogarty’05 for the creation of predictive models] </li></ul>
  15. 16. Estimators <ul><li>40 subjects (Online recruitment, majority were students) </li></ul><ul><li>Watched 15- or 30-second clips of reporters </li></ul><ul><ul><li>Ensured that Estimators did not know the Reporters </li></ul></ul><ul><li>Provided estimates of interruptibility </li></ul><ul><li>60 clips each </li></ul><ul><ul><li>Allowed to watch each clip as many times as they wanted </li></ul></ul><ul><li>Task lasted about 1 hour </li></ul><ul><li>[Used in Fogarty’05 to compare performance of estimators vs. ML] </li></ul>
  17. 18. Estimations vs. Self-Reports <ul><li>Tested the relationship between Estimations and Reports: </li></ul><ul><li>Estimated Interruptibility was significantly correlated with Reported Interruptibility (p<.001) </li></ul><ul><ul><li>This is good: it means that Estimators examined the situation presented to them </li></ul></ul><ul><li>Estimated Interruptibility was significantly different from Reported Interruptibility (p<.001) </li></ul><ul><ul><li>Estimators, on average, estimated Reporters to be more interruptible than reported </li></ul></ul><ul><ul><li>(Two outliers’ data excluded from the remaining analyses) </li></ul></ul>
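The two findings on this slide can hold at the same time: estimates can track self-reports closely and still be shifted away from them. The sketch below illustrates this with made-up ratings (not the study's data) in which every estimate is one point lower, that is, more interruptible, than the self-report. It computes a plain Pearson correlation and a mean gap only; it does not reproduce the significance tests used in the talk.

```python
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical ratings on the 1-5 scale (1 = highly interruptible).
reported = [2, 3, 4, 5, 3, 4]
estimated = [1, 2, 3, 4, 2, 3]  # always one point more interruptible

r = pearson(reported, estimated)                 # perfectly correlated
mean_gap = mean(reported) - mean(estimated)      # yet consistently different
print(f"correlation = {r:.2f}, mean gap = {mean_gap:.1f}")
```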
  18. 19. Measures and Analysis
  19. 20. Contextual Cues <ul><li>Coded by six paid coders </li></ul><ul><li>For each 15-second segment, coded for a large set of contextual cues that could be coded reliably </li></ul><ul><ul><li>Reporters’ activities </li></ul></ul><ul><ul><li>Guest activities </li></ul></ul><ul><ul><li>Environmental cues </li></ul></ul><ul><li>Inter-coder agreement was 93.4% (evaluated on 5% of the data) </li></ul>
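The slide reports inter-coder agreement as a percentage but does not say how it was computed. A minimal sketch, assuming simple percent agreement between two coders over the same segments (the function name and the labels are hypothetical, not taken from the study):

```python
def percent_agreement(coder_a, coder_b):
    """Share of segments (in percent) on which two coders gave the same code."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must label the same non-empty set of segments")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return 100.0 * matches / len(coder_a)


# Hypothetical codes for one cue (e.g. "door is closed") over five segments.
coder_a = [True, False, True, True, False]
coder_b = [True, False, False, True, False]

agreement = percent_agreement(coder_a, coder_b)
print(f"{agreement:.1f}% agreement")
```

Percent agreement does not correct for chance; a chance-corrected statistic would be a different measure.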
  20. 21. Contextual Cues (cont.) Phone Social Engagement Computer Desk Papers File Cabinet Food Writing Door is Closed Drink Standing Present
  21. 22. “Estimation Error” <ul><li>Estimation Error = Reported – Estimated </li></ul>[Figure: 5×5 grid with Reported (1–5) on one axis and Estimated (1–5) on the other]
  22. 23. “Estimation Error” <ul><li>Estimation Error = Reported – Estimated </li></ul>[Figure: the same grid with each cell showing Reported – Estimated, from -4 to 4, with 0 where Reported equals Estimated]
  23. 24. “Estimation Error” <ul><li>Estimation Error = Reported – Estimated </li></ul>[Figure: the grid split into Under-estimation cells (errors -1 to -4) and Over-estimation cells (errors 1 to 4), with 0 where Reported equals Estimated]
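The Estimation Error definition on these slides can be sketched as a small grid computation. The sign reading used here is an assumption inferred from the scale (1 = highly interruptible, 5 = highly non-interruptible): a negative error means the Estimator judged the Reporter less interruptible than reported (under-estimation), and a positive error means more interruptible than reported (over-estimation). Function names are illustrative.

```python
from collections import Counter

SCALE = range(1, 6)  # 1 = highly interruptible ... 5 = highly non-interruptible


def estimation_error(reported: int, estimated: int) -> int:
    """Estimation Error = Reported - Estimated, as defined on the slide."""
    return reported - estimated


def classify(error: int) -> str:
    # Assumed reading: negative = under-estimated interruptibility,
    # positive = over-estimated interruptibility, zero = exact match.
    if error < 0:
        return "under"
    if error > 0:
        return "over"
    return "exact"


# Every (Reported, Estimated) combination on the 5-point scale.
grid = {(r, e): estimation_error(r, e) for r in SCALE for e in SCALE}
counts = Counter(classify(err) for err in grid.values())

for r in SCALE:
    print(r, [grid[(r, e)] for e in SCALE])
print(dict(counts))
```

The counts (10 under, 10 over, 5 exact) match the cell values that survive from the slide's grid.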
  24. 25. Analysis Approach <ul><li>Step 1: Find which cues have an effect on Estimation Error </li></ul><ul><ul><li>Effect on Under-Estimation errors? </li></ul></ul><ul><ul><li>Effect on Over-Estimation errors? </li></ul></ul><ul><li>Step 2: Investigate the cause for a cue’s effect on Estimation Error </li></ul><ul><ul><li>Effect on Reported Interruptibility? </li></ul></ul><ul><ul><li>Effect on Estimated Interruptibility? </li></ul></ul>
  25. 26. Analysis Approach <ul><li>Step 1: Find which cues have an effect on Estimation Error </li></ul><ul><ul><li>Effect on Under-Estimation errors? </li></ul></ul><ul><ul><li>Effect on Over-Estimation errors? </li></ul></ul><ul><li>Step 2: Investigate the cause for a cue’s effect on Estimation Error </li></ul><ul><ul><li>Effect on Reported Interruptibility? </li></ul></ul><ul><ul><li>Effect on Estimated Interruptibility? </li></ul></ul><ul><li>For example: </li></ul><ul><li>Found that the Reporter using the [cue] had a </li></ul><ul><li>Significant effect on Over-Estimation (no significant effect on Under-Estimation) </li></ul><ul><li>Significant effect on Estimated Interruptibility, but no significant effect on Reported Interruptibility </li></ul><ul><li>“Considering a cue that is not significant” </li></ul>
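Step 2 above can be read as a small decision rule over two significance results. The sketch below is an interpretation, not the authors' stated procedure: it assumes a cue is labeled “Considering a cue that is not significant” when it affects Estimated but not Reported interruptibility (as in the example on this slide), and “Overrating the strength of a cue” when it affects both. The function name and the handling of the remaining cases are hypothetical.

```python
from typing import Optional


def bias_source(affects_reported: bool, affects_estimated: bool) -> Optional[str]:
    """Map Step 2 significance results to one of the talk's two bias labels."""
    if affects_estimated and not affects_reported:
        # Estimators react to a cue that does not relate to the self-reports.
        return "Considering a cue that is not significant"
    if affects_estimated and affects_reported:
        # Assumed reading: the cue matters, but Estimators weight it too heavily.
        return "Overrating the strength of a cue"
    # Cases the slides do not discuss.
    return None


print(bias_source(affects_reported=False, affects_estimated=True))
print(bias_source(affects_reported=True, affects_estimated=True))
```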
  26. 27. A couple of notes on the analysis <ul><li>A self-report determines the possible range of Estimation Errors </li></ul><ul><ul><li>Reported = 1, Error can be -4 … 0 </li></ul></ul><ul><ul><li>Reported = 5, Error can be 0 … 4 </li></ul></ul><ul><ul><li>=> Need to include the Reported Interruptibility as a control measure in the analysis of error </li></ul></ul><ul><li>All done using Mixed Model analysis </li></ul>
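The first note follows directly from Estimation Error = Reported – Estimated on a 1–5 scale: the self-report fixes which errors are possible at all. A minimal sketch (the function name is illustrative):

```python
from typing import Tuple


def error_range(reported: int, lo: int = 1, hi: int = 5) -> Tuple[int, int]:
    """Smallest and largest possible Reported - Estimated for a given self-report."""
    if not lo <= reported <= hi:
        raise ValueError("reported must lie on the rating scale")
    # The error is smallest when Estimated is at the top of the scale,
    # and largest when Estimated is at the bottom.
    return reported - hi, reported - lo


for reported in range(1, 6):
    print(reported, error_range(reported))
```

Because the feasible range shifts with the self-report, an analysis of error that ignores Reported Interruptibility would confound the two, which is why the slide includes it as a control.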
  27. 28. Results
  28. 29. Social Engagement <ul><li>Under-estimated when the reporter was socially engaged </li></ul><ul><li>Over-estimated when the reporter wasn’t socially engaged </li></ul>“Overrating the strength of a cue”
  29. 30. Phone <ul><li>Over-estimated the reporter’s interruptibility when the reporter wasn’t using the phone </li></ul>“Overrating the strength of a cue”
  30. 31. Reporter is Standing <ul><li>Greater over-estimation error when the reporter was standing </li></ul><ul><li>Reporter standing was significantly correlated with the situation being more interruptible (both Reported and Estimated) </li></ul><ul><ul><li>Link between physical transitions and interruptions [Ho’05] </li></ul></ul>“Overrating the strength of a cue”
  31. 32. Computer <ul><li>Estimators were more likely to interpret a situation as more interruptible than reported when the Reporter was using the computer </li></ul><ul><ul><li>Link to issues of online presence and availability </li></ul></ul>“Considering a cue that is not significant”
  32. 33. Door is Closed <ul><li>Under-estimated when the door was closed </li></ul><ul><li>Correlation between the state of the door and Reported interruptibility was not significant </li></ul>“Considering a cue that is not significant”
  33. 34. Drink <ul><li>Estimators assessed Reporters as more interruptible when they were drinking </li></ul><ul><li>Correlation between drinking and Reported interruptibility was not significant </li></ul>“Considering a cue that is not significant”
  34. 35. Conclusions and Future Work
  35. 36. Conclusions <ul><li>Presented results from an in-depth analysis of causes for biases in human estimation of interruptibility </li></ul><ul><li>Compared self-reports, collected in the field, and estimations based on audio and video recordings </li></ul>
  36. 37. Conclusions (cont.) <ul><li>Findings suggest that providing too much information may not only be a concern for privacy, but may lead to errors in estimations </li></ul><ul><ul><li>Sharing certain contextual cues will likely result in misinterpretations of a person’s interruptibility </li></ul></ul><ul><li>A new system, informed by our results, could </li></ul><ul><ul><li>Avoid exposing certain cues (or specific levels of cues) </li></ul></ul><ul><ul><li>Enhance (or moderate) others </li></ul></ul>
  37. 38. Future Work <ul><li>Examine the effect of other clip durations </li></ul><ul><li>Examine the effect of the degree of familiarity between reporter and estimator on estimation errors </li></ul><ul><li>Observe reporters in other settings and jobs </li></ul><ul><li>Investigate the use of these findings for effective creation of computer-supported communication and awareness systems </li></ul>
  38. 39. Acknowledgements <ul><li>Yaakov Kareev </li></ul><ul><li>Darren Gergle </li></ul><ul><li>Laura Dabbish </li></ul><ul><li>Joonhwan Lee </li></ul>
  39. 40. Thank you. This work was funded in part by NSF Grants IIS-0121560, IIS-0325351, and by DARPA Contract No. NBCHD030010. For more info visit: www.cs.cmu.edu/~nx6
  40. 41. FAQ <ul><li>2PT </li></ul><ul><li>Y-SPRTE </li></ul><ul><li>LEN1530 </li></ul>
  41. 42. Why separate over and under? <ul><li>Shouldn’t just use absolute or squared error, because over- and under-estimation will make it seem like there is no effect </li></ul><ul><li>Shouldn’t just pool all errors together, because they will cancel each other out </li></ul>
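Both points on this slide can be seen with a few made-up numbers. The sketch assumes a hypothetical cue that flips Estimators from over-estimating to under-estimating; the error values are invented for illustration and are not from the study.

```python
from statistics import mean

# Hypothetical errors (Reported - Estimated) for clips without and with a cue.
errors_without_cue = [2, 2, 1, 2]      # all over-estimations
errors_with_cue = [-2, -1, -2, -2]     # all under-estimations

# Pooling signed errors: the two directions cancel out.
pooled_signed_mean = mean(errors_without_cue + errors_with_cue)

# Absolute error: identical in both conditions, so the cue appears to do nothing.
abs_without = mean(abs(e) for e in errors_without_cue)
abs_with = mean(abs(e) for e in errors_with_cue)

# Separating directions exposes the effect of the cue.
over_without = [e for e in errors_without_cue if e > 0]
over_with = [e for e in errors_with_cue if e > 0]

print(pooled_signed_mean, abs_without, abs_with)
print(len(over_without), len(over_with))
```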
  42. 43. Is the length of the clips reasonable? <ul><li>The information available to Estimators in this study (15- or 30-second video and audio clips) was similar to information available to users of media space systems, and far richer than information available in most awareness systems </li></ul>
  43. 44. Why not use a 2-point scale? <ul><li>With 2 levels, Estimation is either 100% correct or 100% incorrect </li></ul><ul><ul><li>With 5 levels, we get degrees of error </li></ul></ul><ul><li>Doesn’t make sense to ask Reporters and Estimators to make a binary decision when they </li></ul><ul><ul><li>Don’t know what the interruption is about </li></ul></ul><ul><ul><li>Don’t know whom the interruption is from </li></ul></ul><ul><li>Couldn’t analyze 5-point data as 2-point data </li></ul><ul><ul><li>If Reported = 1 and Estimated = 4, should we count it as correct? </li></ul></ul>
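The last point can be made concrete by collapsing the 5-point ratings to two classes. The cutoff below (only a rating of 5 counts as non-interruptible) is a hypothetical choice made for this sketch; the slide does not specify one.

```python
def to_binary(rating: int, cutoff: int = 5) -> bool:
    """True means 'non-interruptible'. The cutoff is an assumed, illustrative choice."""
    return rating >= cutoff


reported, estimated = 1, 4

five_point_error = reported - estimated                      # three scale points apart
binary_match = to_binary(reported) == to_binary(estimated)   # same class after collapsing

print(five_point_error, binary_match)
```

Under this cutoff the pair counts as a correct estimate even though the two ratings are three points apart, which is the loss of information the slide objects to.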
