Introduction to Web Survey Usability Design and Testing

Amy Anderson Riemer and I taught this short course.

Speaker notes
  • Explain lab environment with eye tracker
  • Metadata includes any contextual information that can be relevant to interpret the dataset. Auxiliary data may not be available at the respondent (R) level; it may only be available at an aggregate level. This type of data can be used for nonresponse adjustment. It can be used as a benchmark in order to assess the quality of the reported data. It can also be used prior to data collection for statistical adjustments for sampling purposes.
  • Dr. Couper is a research professor at the Survey Research Center at the University of Michigan. His most recent work is focused on web survey design.
  • This is the TSE framework developed by Groves. The circles in the middle represent all of the different sources of error that can occur during the development and administration of a survey. Later in the day we will actually come back to a few of these sources of error to discuss how the different modes of electronic data collection that we have been discussing can affect those errors.
  • Kreuter has taken Groves' framework and identified where in the survey process paradata can provide information.
  • In a responsive design, process indicators allow for real-time monitoring, so survey managers can make decisions that adjust their resources to achieve better outcomes. Managers can use information from early results to help cut costs and achieve higher response rates. Managers can also use information learned from early paradata to augment methods for trying to reach uncooperative respondents. These nonrespondents could be different from your respondents, so by successfully recruiting them, you can possibly reduce nonresponse bias.
  • Researchers are using information learned from paradata about nonrespondents to make nonresponse adjustments.
  • In order to collect information about each contact, contact forms are completed. The Census Bureau uses the Contact History Instrument (CHI) to do this.
  • Example of options given: advance letter given, scheduled an appointment, left a note/appointment card, offered incentive, checked with neighbors, contacted other family members, contacted property manager, none.
  • Recording the date/time of prior contact allows schedulers to vary contact with the hope of increasing the probability of a successful contact and saving costs.
  • Answering drop-outs = answer questions but quit before completing the survey. Lurkers = view all of the questions but don't answer any. Lurking drop-outs = view some of the questions without responding and quit before completing. Item non-responding drop-outs = view some questions, answer some but not all, and break off before completing the survey.
  • It is necessary to decide ahead of time which actions are meaningful to record, based on the interests of the researchers/managers. These are examples of the meaningful actions that Heerwegh decided to collect.
  • It can even detect when users respond through devices such as book readers or game consoles.
  • An area of growing concern is the use of tablets to answer surveys and the inability of some devices to run Flash. Paradata can allow survey managers to monitor what devices their population (or similar populations if it is a one-time survey) are using and choose to, for example, redesign questions so that Flash isn't necessary if tablet usage increases.
  • Paradata will show you all of the answers that a respondent entered into a field. Excessive movement can indicate an issue with the question or the potential for lower-quality responses.
  • This navigational paradata can also be captured on CATI instruments to assess the design of those questionnaires as well.
  • The quality of online questionnaires is now evaluated using this formula. The number of all possible errors is the sum of all potential error messages, potential prompts, and validation messages programmed into the instrument. The number of activated errors is the number that were encountered by the R during the survey.
  • Help links… if you expect/want R's to see your help information more frequently, then you may need to list it on the question screen somehow rather than leaving it as a link.
  • Note that some call centers may attempt to route the caller using menus to different staff based on the issue. For example, there may be a separate number or prompt that will take R’s with technical issues to support staff trained to deal with those issues.
  • Interviewers vary in how they interpret the things about the R or the household that they are asked to record in the Contact History Instrument. Some studies have shown high rates of missing data for interviewer-provided paradata (CHI information). If the paradata needed is provided by interviewers, it can be incomplete or entirely skipped, and therefore not as reliable.
  • Paradata can create very large files that need to be saved on servers. Client-side paradata that collects such things as mouse movements and keystrokes can be incredibly large. This is partly the reason why researchers are encouraged to think ahead of time about the types of “meaningful” information that is needed. Capturing large amounts of paradata in electronic instruments could slow down the performance of the instrument for Rs. Resources must be dedicated to develop the tools to collect the paradata within the system (web, CATI, CAPI). Resources must be dedicated to monitor the paradata in a responsive design and/or to analyze the data post hoc for nonresponse adjustment or changes to the instrument. There can often be massive amounts of paradata to work with and sift through.
  • Mixed/Multi/Multiple are used interchangeably. There may be other practical or regulatory constraints that also affect the data collection design.
  • Concurrent --
  • Because of the high cost of in-person interviews, survey organizations use them either to establish their initial contact and ensure lower coverage error (such as in the CPS example) or as a nonresponse follow-up (NRFU) option after other NRFU options have been exhausted.
  • There are some examples where organizations are offering mixed modes to save time. TDE (touch-tone data entry) is when R's enter data directly over the phone after being prompted by an automated recording. EDI (electronic data interchange) is the transmission of data between two entities.
  • This is the TSE framework developed by Groves. The circles in the middle represent all of the different sources of error that can occur during the development and administration of a survey. Later in the day we will actually come back to a few of these sources of error to discuss how the different modes of electronic data collection that we have been discussing can affect those errors.
  • Households with no telephones. Cell phone only households. No directory of cell phone numbers.
  • http://www.pewinternet.org/Trend-Data-(Adults)/Internet-Adoption.aspx
  • http://www.internetworldstats.com/stats.htm
  • Based on demographic characteristics. This limits the usefulness of data collection by internet only. People share e-mail addresses or have multiple ones. There is no equivalent to the RDD algorithm used to select phone numbers for generating a random sample. Researchers have come to rely more on self-selected panels (which aren't necessarily representative of the general population).
  • This phenomenon is not necessarily seen in establishment surveys.
  • Some of this information is based on de Leeuw’s meta-analysis of 67 articles/papers on mode comparisons.
  • De Leeuw's International Handbook of Survey Methodology: mode effects do exist but tend to be small in well-conducted surveys. The biggest differences are seen with sensitive questions. Interviewer modes produced more socially desirable and less consistent answers, but also more detailed answers to open-ended questions.
  • Two current Ph.D. students at JPSM. Some articles were saying that offering both options raised response rates, and some were saying that it produced an overall decrease in response rates.
  • Because there weren't a large number of studies available, they couldn't identify any study characteristics that would explain why this was going on. Hyp 2: in the process of sorting mail and organizing themselves to go on the Web, R's could have forgotten or misplaced the survey invitation. Hyp 3: although in the studies they looked at, approximately 80% of R's who started the Web instrument completed it.
  • Overall response rate from the survey was 62%. The mail preference group had the highest response rate followed by the equal preference group. Dillman felt that although the response rate for the web preference was lower than the other groups, it was a respectable web response rate and he was encouraged by the fact that 41% of respondents could be ‘pushed’ into the web option. This suggested to Dillman that withholding paper could be a tool for surveyors to drive more respondents onto the web.
  • The telephone was not an option for respondents in this survey, so further analysis of this low percentage was not given.
  • Dillman also looked at the relationship between mode preferences and actual mode of response completed. There is a strong relationship between the mode completed and the mode preferred. This was especially interesting because there were treatments in which respondents were ‘pushed’ to respond in one mode versus the other. R’s appeared to prefer the mode they were pushed into, suggesting that ‘pushing’ could work.
  • Dillman then isolated the Web Preference group to see how strongly they preferred reporting via Web. He found a strong relationship between the mode that the respondent reported and their preference for that mode. This suggested that by pushing respondents to a certain mode, not only will they report in that mode, but they will like it!
  • It may seem exciting that you can push people to the web, but through additional chi-square analysis, Dillman looked further at different characteristics of respondents to see what limitations there would be to pushing respondents to the web. Because of coverage issues associated with the web, certain people will have no other choice but to respond via mail. Mail respondents are older than web respondents and have lower levels of education and lower income. Mail respondents are also more likely to be female and married. These results show that only certain individuals may be 'pushed' successfully to the web. If you are doing a survey without an alternative mode to the web, you could introduce nonresponse bias.
  • There is a condition that doesn't mention web as an option until much later in the mailouts (Day 23), and an alternative condition that doesn't mail out a paper form until later in the mailouts (Day 23). The other two conditions are a mix of varying “web intensities.” In other words, there was one condition that started out with a strong push to the web without mentioning paper, one that had a strong push toward paper and didn't mention web until the last mailout, and two in the middle that mentioned web and paper at different times in the middle mailings.
  • Two differences are statistically significant: S and A2 and S and A3.
  • The results showed promise for promoting the web as an alternative strategy earlier. In the A4 sample, R's didn't see a paper form until the third contact. The authors also noted that the A2-A4 strategies could lead to a cost reduction of 12-20% compared to the cost of the standard strategy.
  • The current ACS has three modes: paper, CATI, and CAPI. In 2011, the Census Bureau conducted two Internet tests to evaluate the feasibility of offering a web mode and to identify the best way to present that mode to promote self-response. These are results from the first of the two tests.
  • Along with the control panel there were Push and Choice panels. Each one tested two different strategies. For the Push panel, they tested how quickly the follow-up (with paper form) should occur. The choice panel tested advertising of the web option in a prominent place or a subtle/inconspicuous place. The Targeted group consisted of areas that contained households that used the Internet at a higher rate. The remaining tracts were in the Not Targeted group.
  • The accelerated Push strategy (on the right) was seen as very successful and was the first test where the Census Bureau saw a push strategy perform well in a household survey. It performed better than the Prominent choice strategy or the control.
  • The accelerated Push strategy continued to show a strong presence in the Not Targeted areas, which was an unexpected finding. Response rates were not significantly different between the control and the choice strategies. Similar to the targeted areas, the push strategy was successful in having most of its responses sent via Internet.
  • This testing of a push strategy on just the form follow-up showed that the letter-only strategy had a higher percentage of web responses than the form-and-letter group. There was no effect on overall response rate between the two groups. This is similar to the finding in other studies that the mode offered is the mode preferred.
  • This table represents the various calls that were made to the different treatments offered in the ACS Internet Test. There were five treatments (control, prominent choice, not prominent choice, push regular, push accelerated). Among those, there were respondents who replied via internet, respondents who replied via mail, and overall nonrespondents. This table shows the number of calls that were attempted and analyzed.
  • The letter was the primary way of communicating the mode choice to Rs. We are always working on the materials that go into the mailing packages to encourage R's to respond to the survey and to respond in the mode that we want them to if we give them a choice. So this was hopeful information because we always wonder whether R's remember these pieces of information. But just because they remember it doesn't mean that they read it.
  • Introduction to Web Survey Usability Design and Testing

    1. 1. Introduction to Web Survey Usability Design and Testing DC-AAPOR Workshop Amy Anderson Riemer Jennifer Romano Bergstrom The views expressed on statistical or methodological issues are those of the presenters and not necessarily those of the U.S. Census Bureau.
    2. 2. Schedule: 9:00 – 9:15 Introduction & Objectives; 9:15 – 11:45 Web Survey Design: Desktop & Mobile; 11:45 – 12:45 Lunch; 12:45 – 2:30 Assessing Your Survey; 2:30 – 2:45 Break; 2:45 – 3:30 Mixed Modes Data Quality; 3:30 – 4:00 Wrap Up 2
    3. 3. Objectives. Web Survey Design: Desktop & Mobile • Paging vs. Scrolling • Navigation • Scrolling lists vs. double-banked response options • Edits & Input fields • Checkboxes & Radio buttons • Instructions & Help • Graphics • Emphasizing Text & White Space • Authentication • Progress Indicators • Consistency. Assessing Your Survey • Paradata • Usability. Quality of Mixed Modes • Mixed Mode Surveys • Response Rates • Mode Choice 3
    4. 4. Web Survey DesignThe views expressed on statistical or methodological issues are those of the presenters and not necessarily those of the U.S. Census Bureau.
    5. 5. Activity #1 1. Today's date 2. How long did it take you to get to BLS today? 3. What do you think about the BLS entrance? 5
    6. 6. Why is Design Important?• No interviewer present to correct/advise• Visual presentation affects responses – (Couper’s activity)• While the Internet provides many ways to enhance surveys, design tools may be misused 6
    7. 7. Why is Design Important?• Respondents extract meaning from how question and response options are displayed• Design may distract from or interfere with responses• Design may affect data quality 7
    8. 8. Why is Design Important? 8 http://www.cc.gatech.edu/gvu/user_surveys/
    9. 9. Why is Design Important?• Many surveys are long (> 30min)• Long surveys have higher nonresponse rates• Length affects quality Adams & Darwin, 1982; Dillman et al., 1993; 9 Heberlein & Baumgartner, 1978
    10. 10. Why is Design Important?• Respondents are more tech savvy today and use multiple technologies• It is not just about reducing respondent burden and nonresponse• We must increase engagement• High-quality design = trust in the designer Adams & Darwin, 1982; Dillman et al., 1993; 10 Heberlein & Baumgartner, 1978
    11. 11. http://www.pewinternet.org/Static-Pages/Trend- 11Data-(Adults)/Device-Ownership.aspx
    12. 12. http://www.pewinternet.org/Static-Pages/Trend- 12Data-(Adults)/Device-Ownership.aspx
    13. 13. http://www.nielsen.com/content/dam/corporate/us/en/reports-downloads/2012-Reports/Nielsen- 13Multi-Screen-Media-Report-May-2012.pdf
    14. 14. http://www.nielsen.com/content/dam/corporate/us/en/reports-downloads/2012-Reports/Nielsen- 14Multi-Screen-Media-Report-May-2012.pdf
    15. 15. Nielsen: The Cross-Platform Report, Quarter 2, 152012-US
    16. 16. UX Design Failure• Poor planning• “It’s all about me.” (Redish: filing cabinets)• Human cognitive limitations – Memory & Perception – (fun activity time)
    17. 17. UX Design Failure• Poor planning• “It’s all about me.” (Redish: filing cabinets)• Human cognitive limitations – Memory & Perception – (fun activity time) - Primacy - Chunking - Recency - Patterns
    18. 18. Web Survey Design • Paging vs. Scrolling • Navigation • Scrolling vs. Double-Banked • Edits and Input Fields • Checkboxes and Radio Buttons • Instructions and Help • Graphics • Emphasizing Text • White Space • Authentication • Progress Indicators • Consistency 23
    19. 19. Paging vs. Scrolling. Paging: • Multiple questions per page • Complex skip patterns • Not restricted to one item per screen • Data from each page saved – Can be suspended/resumed • Order of responding can be controlled • Requires more mouse clicks. Scrolling: • All on one static page • No data is saved until submitted at end – Can lose all data • Respondent can review/change responses • Questions can be answered out of order • Similar look-and-feel as paper 24
    20. 20. Paging vs. Scrolling• Little advantage (breakoffs, nonresponse, time, straightlining) of one over the other• Mixed approach may be best• Choice should be driven by content and target audience – Scrolling for short surveys with few skip patterns; respondent needs to see previous responses – Paging for long surveys with intricate skip patterns; questions should be answered in order Couper, 2001; Gonyea, 2007; Peytchev, 2006; 25 Vehovar, 2000
    21. 21. Web Survey Design • Paging vs. Scrolling • Navigation • Scrolling vs. Double-Banked • Edits and Input Fields • Checkboxes and Radio Buttons • Instructions and Help • Graphics • Emphasizing Text • White Space • Authentication • Progress Indicators • Consistency 26
    22. 22. Navigation• In a paging survey, after entering a response – Proceed to next page – Return to previous page (sometimes) – Quit or stop – Launch separate page with Help, definitions, etc. 27
    23. 23. Navigation: NP• Next should be on the left – Reduces the amount of time to move cursor to primary navigation button – Frequency of use Couper, 2008; Dillman et al., 2009; Faulkner, 28 1998; Koyani et al., 2004; Wroblewski, 2008
    24. 24. Navigation NP Example Peytchev & Peytcheva, 2011 29
    25. 25. Navigation: PN• Previous should be on the left – Web application order – Everyday devices – Logical reading order 30
    26. 26. Navigation PN Example 31
    27. 27. Navigation PN Example 32
    28. 28. Navigation PN Example 33
    29. 29. Navigation PN Example 34
    30. 30. Navigation Usability Study/Experiment Romano & Chen, 2011 35
    31. 31. Method• Lab-based usability study• TA read introduction and left letter on desk• Separate rooms• R read letter and logged in to survey• Think Aloud• Eye Tracking• Satisfaction Questionnaire• Debriefing Romano & Chen, 2011 36
    32. 32. Results: Satisfaction I * p < 0.0001 Romano & Chen, 2011 37
    33. 33. Results: Satisfaction II (charts of mean satisfaction ratings, N_P vs. PN). Overall reaction to the survey (terrible – wonderful): p < 0.05. Information displayed on the screens (inadequate – adequate): p = 0.07. Arrangement of information on the screens (illogical – logical): p = 0.19. Forward navigation (impossible – easy): p = 0.13. Romano & Chen, 2011 38
    34. 34. Eye Tracking• Participants looked at Previous and Next in PN conditions• Many participants looked at Previous in the N_P conditions – Couper et al. (2011): Previous gets used more when it is on the right. 39
    35. 35. N_P vs. PN: Respondent Debriefing• N_P version – Counterintuitive – Don’t like the “buttons being flipped.” – Next on the left is “really irritating.” – Order is “opposite of what most people would design.”• PN version – “Pretty standard, like what you typically see.” – The location is “logical.” Romano & Chen, 2011 40
    36. 36. Navigation Alternative• Previous below Next – Buttons can be closer – But what about older adults? – What about on mobile? Couper et al., 2011; Wroblewski, 2008 41
    37. 37. Navigation Alternative• Previous below Next – Buttons can be closer – But what about older adults? – What about on mobile? Couper et al., 2011; Wroblewski, 2008 42
    38. 38. Navigation Alternative: Large primary navigation button; secondary smaller 43
    39. 39. Navigation Alternative: No back/previous option 44
    40. 40. Confusing Navigation 45
    41. 41. Web Survey Design • Paging vs. Scrolling • Navigation • Scrolling vs. Double-Banked • Edits and Input Fields • Checkboxes and Radio Buttons • Instructions and Help • Graphics • Emphasizing Text • White Space • Authentication • Progress Indicators • Consistency 46
    42. 42. Long List of Response Options• One column: Scrolling – Visually appear to belong to one group – When there are two columns, 2nd one may not be seen (Smyth et al., 1997)• Two columns: Double banked – No scrolling – See all options at once – Appears shorter 47
    43. 43. 1 Column vs. 2 Column Study Romano & Chen, 2011 48
    44. 44. Seconds to First Fixation (chart: first half vs. second half of the list, 2-column vs. 1-column). * p < 0.01 Romano & Chen, 2011 49
    45. 45. Total Number of Fixations (chart: first half vs. second half of the list, 2-column vs. 1-column) Romano & Chen, 2011 50
    46. 46. Time to Complete Item (chart in seconds: mean, min, and max for 1-column vs. 2-column) Romano & Chen, 2011 51
    47. 47. 1 Col. vs. 2 Col.: Debriefing• 25 had a preference – 6 preferred one column • They had received the one-column version – 19 preferred 2 columns • 7 had received the one-column version • Prefer not to scroll • Want to see and compare everything at once • It is easier to “look through,” to scan, to read • Re one column, “How long is this list going to be?” Romano & Chen, 2011 52
    48. 48. Long Lists• Consider breaking list into smaller questions• Consider series of yes/no questions• Use logical order or randomize• If using double-banked, do not separate columns widely 53
    49. 49. Web Survey Design • Paging vs. Scrolling • Navigation • Scrolling vs. Double-Banked • Edits and Input Fields • Checkboxes and Radio Buttons • Instructions and Help • Graphics • Emphasizing Text • White Space • Authentication • Progress Indicators • Consistency 54
    50. 50. Input Fields Activity 55
    51. 51. Input Fields• Smaller text boxes = more restricted• Larger text boxes = less restricted – Encourage longer responses• Visual/Verbal Miscommunication – Visual may indicate “Write a story” – Verbal may indicate “Write a number”• What do you want to allow? 56
    52. 52. Types of Open-Ended Responses• Narrative – E.g., Describe…• Short verbal responses – E.g., What was your occupation?• Single word/phrase responses – E.g., Country of residence• Frequency/Numeric response – E.g., How many times…• Formatted number/verbal – E.g., Telephone number 57
    53. 53. Open-Ended Responses: Narrative• Avoid vertical scrolling when possible• Always avoid horizontal scrolling 58
    54. 54. Open-Ended Responses: Narrative• Avoid vertical scrolling when possible• Always avoid horizontal scrolling 32.8 characters 38.4 characters ~700 Rs Wells et al., 2012 59
    55. 55. Open-Ended Responses: Numeric• Is there a better way? 60
    56. 56. Open-Ended Responses: Numeric• Is there a better way? 61
    57. 57. Open-Ended Responses: Numeric• Use of templates reduces ill-formed responses – E.g., $_________.00 Couper et al., 2009; Fuchs, 2007 62
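A numeric template like the $_________.00 example on the slide above can be approximated client-side by constraining what the input field accepts. This is only a minimal TypeScript sketch; the field id, the whole-dollar format, and the event choices are illustrative assumptions, not anything prescribed by the deck.

```typescript
// Attach a simple currency template to a text field: digits only while
// typing, displayed in the $_____.00 format when the field loses focus.
function attachDollarTemplate(input: HTMLInputElement): void {
  // Strip anything that is not a digit as the respondent types.
  input.addEventListener("input", () => {
    input.value = input.value.replace(/[^0-9]/g, "");
  });
  // On blur, show the value inside the template.
  input.addEventListener("blur", () => {
    if (input.value !== "") {
      input.value = `$${input.value}.00`;
    }
  });
  // On focus, go back to raw digits so the respondent can edit freely.
  input.addEventListener("focus", () => {
    input.value = input.value.replace(/[^0-9]/g, "");
  });
}

// Hypothetical usage for a field such as <input id="amount" type="text">.
const amountField = document.getElementById("amount") as HTMLInputElement;
attachDollarTemplate(amountField);
```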
    58. 58. Open-Ended Responses: Date• Not a good use: intended response will always be the same format• Same for state, zip code, etc.• Note – “Month” = text – “mm/yyyy” = #s 63
    59. 59. Web Survey Design • Paging vs. Scrolling • Navigation • Scrolling vs. Double-Banked • Edits and Input Fields • Checkboxes and Radio Buttons • Instructions and Help • Graphics • Emphasizing Text • White Space • Authentication • Progress Indicators • Consistency 64
    60. 60. Check Boxes and Radio Buttons• Perceived Affordances• Design according to existing conventions and expectations• What are the conventions? 65
    61. 61. Check Boxes: Select all that apply 66
    62. 62. Check Boxes in drop-down menus 67
    63. 63. Radio Buttons: Select only one 68
    64. 64. Radio Buttons: Select only one 69
    65. 65. Radio Buttons: In grids 70
    66. 66. Radio Buttons on mobile• Would something else be better? 71
    67. 67. Reducing Options• What is necessary? 72
    68. 68. Web Survey Design • Paging vs. Scrolling • Navigation • Scrolling vs. Double-Banked • Edits and Input Fields • Checkboxes and Radio Buttons • Instructions and Help • Graphics • Emphasizing Text • White Space • Authentication • Progress Indicators • Consistency 73
    69. 69. Placement of Instructions• Place them near the item• “Don’t make me think”• Are they necessary? 74
    70. 70. Placement of Instructions• Place them near the item• “Don’t make me think”• Are they necessary? 75
    71. 71. Placement of Instructions• Place them near the item• “Don’t make me think”• Are they necessary? 76
    72. 72. Instructions• Key info in first 2 sentences• People skim – Rule of 2s: Key info in first two paragraphs, sentences, words 77
    73. 73. Instructions 78
    74. 74. Instructions 79
    75. 75. Placement of Clarifying Instructions• Help respondents have the same interpretation• Definitions, instructions, examples Conrad & Schober, 2000; Conrad et al., 2006; Conrad et al., 2007; Martin, 2002; Schober & 80 Conrad, 1997; Tourangeau et al., 2010
    76. 76. Placement of Clarifying Instructions 81 Redline, 2013
    77. 77. Placement of Clarifying Instructions• Percentage of valid responses was higher with clarification• Longer response time when before item• No effects of changing the font style• Before item is better than after• Asking a series of questions is best 82 Redline, 2013
    78. 78. Placement of Help• People are less likely to use help when they have to click than when it is near item• “Don’t make me think” 83
    79. 79. Placement of Error Message• Should be near the item• Should be positive and helpful, suggesting HOW to help• Bad error message: 84
    80. 80. Placement of Error Message• Should be near the item• Should be positive and helpful, suggesting HOW to help• Bad error message: 85
    81. 81. Error Message Across Devices 86
    82. 82. Error Message Across Devices 87
    83. 83. Web Survey Design • Paging vs. Scrolling • Navigation • Scrolling vs. Double-Banked • Edits and Input Fields • Checkboxes and Radio Buttons • Instructions and Help • Graphics • Emphasizing Text • White Space • Authentication • Progress Indicators • Consistency 88
    84. 84. Graphics• Improve motivation, engagement, satisfaction with “fun”• Decrease nonresponse & measurement error• Improve data quality• Gamification 89 Henning, 2012; Manfreda et al., 2002
    85. 85. Graphics• Use when they supply meaning – Survey about advertisements• Use when user experience is improved – For children or video-game players – For low literacy 90 Libman, 2012
    86. 86. Graphics 91
    87. 87. Graphics http://glittle.org/smiley-slider/ http://www.ollie.net.nz/casestudies/smiley_slider/ 92
    88. 88. Graphics Experiment 1.1• Appearance – Decreasing boldness (bold  faded) – Increasing boldness (faded  bold) – Adding face symbols to response options (  )• ~ 2400 respondents• Rated satisfaction re health-related things• 5-pt scale: very satisfied  very dissatisfied 93 Medway & Tourangeau, 2011
    89. 89. Graphics Experiment 1.2 • Bold side selected more (example item “Your physician” on a 5-point scale: Very satisfied, Somewhat satisfied, Neutral, Somewhat dissatisfied, Very dissatisfied) • Less satisfaction when face symbols present (same item with only the Very satisfied and Very dissatisfied endpoints labeled, plus face symbols) 94 Medway & Tourangeau, 2011
    90. 90. Graphics Experiment 2.1• Appearance – Radio buttons – Face symbols (    )• ~ 1000 respondents• Rated satisfaction with a journal• 6-pt scale: very dissatisfied  very satisfied 95 Emde & Fuchs, 2011
    91. 91. Graphics Experiment 2.2• Faces were equivalent to radio buttons• Respondents were more attentive when faces were present – Time to respond 96 Emde & Fuchs, 2011
    92. 92. Slider Usability Study• Participants thought 1 was selected and did not move the slider. 0 was actually selected if they did not respond. 97 Strohl, Romano Bergstrom & Krulikowski, 2012
    93. 93. Graphics Experiment 3.1• Modified the visual design of survey items – Increase novelty and interest on select items – Other items were standard• ~ 100 respondents in experimental condition• ~ 1200 in control• Questions about military perceptions and media usage• Variety of question types 98 Gibson, Luchman & Romano Bergstrom, 2013
    94. 94. Graphics Experiment 3.2• No differences 99 Gibson, Luchman & Romano Bergstrom, 2013
    95. 95. Graphics Experiment 3.3• Slight differences: – Those with enhanced version skipped more often – Those in standard responded more negatively. 100 Gibson, Luchman & Romano Bergstrom, 2013
    96. 96. Graphics Experiment 3.4• Slight differences: – Those with enhanced version skipped more often 101 Gibson, Luchman & Romano Bergstrom, 2013
    97. 97. Graphics Experiment 3.5• No major differences 102 Gibson, Luchman & Romano Bergstrom, 2013
    98. 98. Graphics Considerations• Mixed results• “Ad blindness”• Internet speed and download time• Unintended meaning 103
    99. 99. Graphics Considerations 104
    100. 100. Graphics Considerations 105
    101. 101. Graphics Considerations 106
    102. 102. Web Survey Design • Paging vs. Scrolling • Navigation • Scrolling vs. Double-Banked • Edits and Input Fields • Checkboxes and Radio Buttons • Instructions and Help • Graphics • Emphasizing Text • White Space • Authentication • Progress Indicators • Consistency 107
    103. 103. Emphasizing Text• Font – Never underline plain text – Never use red for plain text – Use bold and italics sparingly 108
    104. 104. Emphasizing Text 109
    105. 105. Emphasizing Text 110
    106. 106. Emphasizing Text• Hypertext – Use meaningful words and phrases – Be specific – Avoid “more” and “click here.” 111
    107. 107. Web Survey Design • Paging vs. Scrolling • Navigation • Scrolling vs. Double-Banked • Edits and Input Fields • Checkboxes and Radio Buttons • Instructions and Help • Graphics • Emphasizing Text • White Space • Authentication • Progress Indicators • Consistency 112
    108. 108. White Space• White space on a page• Differentiates sections• Don’t overdo it 113
    109. 109. White Space 114
    110. 110. Web Survey Design • Paging vs. Scrolling • Navigation • Scrolling vs. Double-Banked • Edits and Input Fields • Checkboxes and Radio Buttons • Instructions and Help • Graphics • Emphasizing Text • White Space • Authentication • Progress Indicators • Consistency 115
    111. 111. Authentication• Ensures respondent is the selected person• Prevents entry by those not selected• Prevents multiple entries by selected respondent 116
    112. 112. Authentication• Passive – ID and password embedded in URL• Active – E-mail entry – ID and password entry• Avoid ambiguous passwords (Couper et al., 2001) – E.g., contains 1, l, 0, o• Security concerns can be an issue• Don’t make it more difficult than it needs to be 117
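One way to act on the "avoid ambiguous passwords" point above is to generate access codes from an alphabet with no look-alike characters. The sketch below is an illustration only; the code length, the alphabet, and the use of Math.random are choices made here, not recommendations from the slides, and a production system would use a cryptographically secure random source.

```typescript
// Generate respondent access codes without ambiguous characters
// (no 0/O and no 1/l/I), per the guidance on the slide above.
const SAFE_ALPHABET = "ABCDEFGHJKMNPQRSTUVWXYZ23456789";

function generateAccessCode(length: number = 8): string {
  let code = "";
  for (let i = 0; i < length; i++) {
    const index = Math.floor(Math.random() * SAFE_ALPHABET.length);
    code += SAFE_ALPHABET[index];
  }
  return code;
}

// Example: print five sample codes.
for (let i = 0; i < 5; i++) {
  console.log(generateAccessCode());
}
```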
    113. 113. Authentication 118
    114. 114. Web Survey Design • Paging vs. Scrolling • Navigation • Scrolling vs. Double-Banked • Edits and Input Fields • Checkboxes and Radio Buttons • Instructions and Help • Graphics • Emphasizing Text • White Space • Authentication • Progress Indicators • Consistency 119
    115. 115. Progress Indicators• Reduce breakoffs• Reduce burden by displaying length of survey• Enhance motivation and visual feedback• Not needed in scrolling design• Little evidence of benefit Couper et al., 2001; Crawford et al., 2001; Conrad et al., 2003, 2005; Sakshaug & 120 Crawford, 2009
    116. 116. Progress Indicators: At the bottom 121
    117. 117. Progress Indicators: At the top 122
    118. 118. Progress Indicators: Mobile 123
    119. 119. Progress Indicators• They should provide meaning 124 Strohl, Romano Bergstrom & Krulikowski, 2012
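One reading of "provide meaning" is to base the indicator on the items that actually apply to this respondent rather than on the full item count, so the bar does not jump when a skip pattern routes past a block. A minimal sketch under that assumption (the item structure is hypothetical):

```typescript
// Progress = answered items / items that apply to this respondent.
interface SurveyItem {
  id: string;
  applies: boolean;  // false if a skip pattern routes the respondent past it
  answered: boolean;
}

function progressPercent(items: SurveyItem[]): number {
  const applicable = items.filter((item) => item.applies);
  if (applicable.length === 0) return 100;
  const answered = applicable.filter((item) => item.answered).length;
  return Math.round((answered / applicable.length) * 100);
}

// Example: 2 of 3 applicable items answered -> 67.
console.log(progressPercent([
  { id: "q1", applies: true, answered: true },
  { id: "q2", applies: false, answered: false }, // skipped by routing
  { id: "q3", applies: true, answered: true },
  { id: "q4", applies: true, answered: false },
]));
```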
    120. 120. Web Survey Design • Paging vs. Scrolling • Navigation • Scrolling vs. Double-Banked • Edits and Input Fields • Checkboxes and Radio Buttons • Instructions and Help • Graphics • Emphasizing Text • White Space • Authentication • Progress Indicators • Consistency 125
    121. 121. Consistency• Predictable – User can anticipate what the system will do• Dependable – System fulfills user’s expectations• Habit-forming – System encourages behavior• Transferable – Habits in one context can transfer to another• Natural – Consistent with user’s knowledge 126
    122. 122. Inconsistency 127
    123. 123. Inconsistency 128
    124. 124. Inconsistency 129 Strohl, Romano Bergstrom & Krulikowski, 2012
    125. 125. Questions and Discussion
    126. 126. Assessing Your SurveyThe views expressed on statistical or methodological issues are those of the presenters and not necessarily those of the U.S. Census Bureau.
    127. 127. Assessing Your Survey. Paradata • Background • Uses of Paradata by mode • Paradata issues. Usability • Usability vs. User Experience • Why, When, What? • Methods – Focus Groups, In-Depth Interviews – Ethnographic Observations, Diary Studies – Usability & Cognitive Testing • Lab, Remote, In-the-Field • Obstacles
    128. 128. Paradata
    129. 129. Types of Data• Survey Data – collected information from R’s• Metadata – data that describes the survey – Codebook – Description of the project/survey• Paradata – data about the process of answering the survey at the R level• Auxiliary/Administrative Data – not collected directly, but acquired from external sources
    130. 130. Paradata• Term coined by Mick Couper – Originally described data that were by-products of computer-assisted interviewing – Expanded to include data from other self- administered modes• Main uses: – Adaptive / Responsive design – Nonresponse adjustment – Measurement error identification
    131. 131. Total Survey Error Framework Groves et al. 2004; Groves & Lyberg 2010
    132. 132. TSE Framework & Paradata Kreuter, 2012
    133. 133. Adaptive / Responsive Design• Create process indicators• Real-time monitoring (charts & “dashboards”)• Adjust resources during data collection to achieve higher response rate and/or cost savings• Goal: – Achieve high response rates in a cost-effective way – Introduce methods to recruit uncooperative – and possibly different – sample members (reducing nonresponse bias)
    134. 134. Nonresponse Adjustment• Decreasing response rates have encouraged researchers to look at other sources of information to learn about nonrespondents – Doorstep interactions – Interviewer observations – Contact history data
    135. 135. Contact History Instrument (CHI)• CHI developed by the U.S. Census Bureau (Bates, 2003)• Interviewers take time after each attempt (refusal or non-contact) to answer questions in the CHI• Use CHI information to create models (i.e., heat maps) to identify optimal contact time• Typically a quick set of questions to answer• European Social Survey uses a standard contact form (Stoop et al., 2003)
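The "heat map" idea on the slide is essentially a tabulation of contact attempts and successful contacts by weekday and hour. A minimal sketch of that aggregation, with record fields that are assumptions rather than the actual CHI layout:

```typescript
// Aggregate contact-history records into a 7 x 24 grid of attempts and
// successful contacts - the raw material for an optimal-contact-time heat map.
interface ContactAttempt {
  timestamp: Date;
  madeContact: boolean; // someone answered the door or the phone
}

type Cell = { attempts: number; contacts: number };

function buildContactHeatMap(history: ContactAttempt[]): Cell[][] {
  const grid: Cell[][] = Array.from({ length: 7 }, () =>
    Array.from({ length: 24 }, () => ({ attempts: 0, contacts: 0 }))
  );
  for (const attempt of history) {
    const cell = grid[attempt.timestamp.getDay()][attempt.timestamp.getHours()];
    cell.attempts += 1;
    if (attempt.madeContact) cell.contacts += 1;
  }
  return grid;
}

// Example: contact rate for Tuesdays at 18:00 (two attempts, one success).
const grid = buildContactHeatMap([
  { timestamp: new Date(2013, 4, 7, 18, 15), madeContact: true },
  { timestamp: new Date(2013, 4, 14, 18, 40), madeContact: false },
]);
const cell = grid[2][18];
console.log(cell.contacts / cell.attempts); // 0.5
```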
    136. 136. Contact History Instrument (CHI) U.S. Census Bureau CHI
    137. 137. Paradata• Background information about Paradata• Uses of Paradata by mode• Paradata issues
    138. 138. Uses of Paradata by Mode• CAPI• CATI• Web• Mail• Post-hoc
    139. 139. Uses of Paradata - CAPI• Information collected can include: – Interviewer time spent calling sampled households – Time driving to sample areas – Time conversing with household members – Interview time – GPS coordinates (tablets/mobile devices)• Information can be used to: – Inform cost-quality decisions (Kreuter, 2009) – Develop cost per contact – Predict the likelihood of response by using interviewer observations of the response unit (Groves & Couper, 1998) – Monitor interviewers and identify any falsification
    140. 140. Uses of Paradata - CATI• Information collected can include: – Call transaction history (record of each attempt) – Contact rates – Sequence of contact attempts & contact rates• Information can be used to: – Optimize call back times – Interviewer monitoring – Inform a responsive design
    141. 141. Uses of Paradata - Web• Server-side vs. client-side• Information collected can include: – Device information (i.e., browser type, operating system, screen resolution, detection of JavaScript or Flash) – Questionnaire navigation information Callegaro, 2012
    142. 142. Web Paradata - Server-side• Page requests or “visits” to a web page from the web server• Identify device information and monitor survey completion
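Server-side paradata of the kind described above is, at its simplest, a log of page requests plus whatever the User-Agent header suggests about the device. A framework-agnostic sketch (the request shape and the crude user-agent test are illustrative assumptions):

```typescript
// One server-side paradata row per page request: which survey page was
// served, when, and a rough device classification from the user agent.
interface PageRequest {
  respondentId: string;
  page: string;       // e.g., "/survey/page-4"
  userAgent: string;  // raw User-Agent header
}

interface ServerParadataRow {
  respondentId: string;
  page: string;
  timestamp: string;
  isMobile: boolean;
}

const serverLog: ServerParadataRow[] = [];

function logPageRequest(req: PageRequest): void {
  serverLog.push({
    respondentId: req.respondentId,
    page: req.page,
    timestamp: new Date().toISOString(),
    // Very rough device classification from the user-agent string.
    isMobile: /Mobile|Android|iPhone|iPad/i.test(req.userAgent),
  });
}

// Example call, as it might be made from a web framework's request handler.
logPageRequest({
  respondentId: "R1042",
  page: "/survey/page-4",
  userAgent: "Mozilla/5.0 (iPad; CPU OS 6_0 like Mac OS X)",
});
```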
    143. 143. Web Paradata - Server-side cont.• Typology of response behaviors in web surveys 1. Complete responders 2. Unit non-responders 3. Answering drop-outs 4. Lurkers 5. Lurking drop-outs 6. Item non-responders 7. Item non-responding drop-outs Bosnjak, 2001
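Bosnjak's typology can be operationalized from three per-case paradata facts: items viewed, items answered, and whether the final page was reached. The classification rule below is one possible reading, not Bosnjak's own code; the field names and the exact cut-offs are assumptions.

```typescript
// Classify a respondent's behavior from simple per-case paradata counts.
interface CaseParadata {
  itemsViewed: number;
  itemsAnswered: number;
  totalItems: number;
  reachedEnd: boolean; // submitted the final page
}

function classifyRespondent(c: CaseParadata): string {
  if (c.itemsViewed === 0) return "unit non-responder";
  const viewedAll = c.itemsViewed === c.totalItems;
  const answeredNone = c.itemsAnswered === 0;
  const answeredAllViewed = c.itemsAnswered === c.itemsViewed;

  if (c.reachedEnd) {
    if (viewedAll && answeredAllViewed) return "complete responder";
    if (answeredNone) return "lurker";
    return "item non-responder";
  }
  // Broke off before the end of the survey.
  if (answeredNone) return "lurking drop-out";
  if (answeredAllViewed) return "answering drop-out";
  return "item non-responding drop-out";
}

// Example: viewed 10 of 30 items, answered 6 of them, quit early.
console.log(classifyRespondent({
  itemsViewed: 10, itemsAnswered: 6, totalItems: 30, reachedEnd: false,
})); // "item non-responding drop-out"
```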
    144. 144. Web Paradata – Client-Side• Collected on the R's computer• Logs each "meaningful" action• Heerwegh (2003) developed code / guidance for client-side paradata collected using JavaScript – Clicking on a radio button – Clicking and selecting a response option in a drop-down box – Clicking a check box (checking / unchecking) – Writing text in an input field – Clicking a hyperlink – Submitting the page
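A minimal sketch of this kind of client-side capture, written in TypeScript rather than Heerwegh's original JavaScript: timestamped "meaningful actions" are collected with standard DOM events and carried along with the answers. The element names, the delegated-listener approach, and the hidden field called "paradata" are all assumptions for illustration.

```typescript
// Record timestamped "meaningful" client-side actions: radio-button,
// checkbox, and text-field changes, plus page submission.
interface ParadataEvent {
  elementName: string;
  action: string;
  value: string;
  msSincePageLoad: number;
}

const pageLoadedAt = Date.now();
const clientEvents: ParadataEvent[] = [];

function record(elementName: string, action: string, value: string): void {
  clientEvents.push({
    elementName,
    action,
    value,
    msSincePageLoad: Date.now() - pageLoadedAt,
  });
}

// One delegated listener covers radio buttons, checkboxes, selects, and text fields.
document.addEventListener("change", (event) => {
  const target = event.target as HTMLInputElement | null;
  if (!target || !target.name) return;
  record(target.name, `${target.type}-change`, target.value);
});

// On submit, serialize the log into a hidden field so it travels with the answers.
document.addEventListener("submit", () => {
  const hidden = document.querySelector<HTMLInputElement>('input[name="paradata"]');
  if (hidden) hidden.value = JSON.stringify(clientEvents);
});
```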
    145. 145. Web Paradata – Client-Side cont.• Stern (2008) used Heerwegh’s paradata techniques to identify: – Whether R’s changed answers; what direction – The order that questions are answered when more than one are displayed on the screen – Response latencies – the time that elapsed between when the screen loaded on the R’s computer and they submitted an answer • Heerwegh (2003) found that the longer the response time, the greater the probability of changing answers and an incorrect response
    146. 146. Browser Information / Operating System Information• Programmers use this information to ensure they are developing the optimal design• Desktop, laptop, smartphone, tablet, or other device• Sood (2011) found a correlation between browser type and survey breakoff & number of missing items – Completion rates for older browsers were lower – Using browser type as a proxy for age of device and possible connection speed – Older browsers were more likely to display survey incorrectly; possible explanation for higher drop-out rates
    147. 147. JavaScript & Flash• Helps to understand what the R can see and do in a survey• JavaScript adds functionality such as question validations, auto-calculations, interactive help – 2% or less of computer users have JavaScript disabled (Zakas, 2010)• Flash is used for question types such as drag & drop or slide-bar questions – Without Flash installed, R’s may not see the question
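Whether JavaScript is available can be detected without asking the respondent: include a hidden field that defaults to "off" and let a tiny script flip it to "on". If the script never runs, the server-side paradata keeps the default. The field name and the (now largely historical) Flash plugin check are assumptions for illustration.

```typescript
// The page is assumed to contain: <input type="hidden" name="js_enabled" value="off">
// If this script executes at all, JavaScript is enabled, so flip the flag.
const jsFlag = document.querySelector<HTMLInputElement>('input[name="js_enabled"]');
if (jsFlag) {
  jsFlag.value = "on";
}

// A similar, plugin-based check for Flash availability.
const hasFlash = Array.from(navigator.plugins).some(
  (plugin) => plugin.name.indexOf("Flash") !== -1
);
console.log("Flash available:", hasFlash);
```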
    148. 148. Flash Question Example
    149. 149. Questionnaire Navigation Paradata• Mouse clicks/coordinates – Captured with JavaScript – Excessive movements can indicate • An issue with the question • Potential for lower quality• Changing answers – Can indicate potential confusion with a question – Paradata can capture answers that were erased – Changes more frequent for opinion questions than factual questions Stieger & Reips, 2010
    150. 150. Questionnaire Navigation Paradata cont.• Order of answering – When multiple questions are displayed on a screen – Can indicate how respondents read the questions• Movement through the questionnaire (forward and back) – Unusual patterns can indicate confusion and a possible issue with the questionnaire (i.e., poor question order)
    151. 151. Questionnaire Navigation Paradata cont.• Number of prompts/error messages/data validation messages• Quality Index (Haraldsen, 2005)• Goal is to decrease number of activated errors by improving the visual design and clarity of the questions
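Haraldsen's Quality Index is described on this slide and in the speaker notes only in words (errors actually activated, relative to all errors programmed into the instrument), so the exact formula below is an assumption; the sketch simply computes that ratio from paradata counts.

```typescript
// Share of programmed checks that the respondent actually triggered.
// Lower is better: the stated goal is to reduce activated errors by
// improving the visual design and the clarity of the questions.
interface InstrumentErrorCounts {
  possibleErrorMessages: number; // error messages programmed into the instrument
  possiblePrompts: number;       // soft prompts programmed into the instrument
  possibleValidations: number;   // validation messages programmed into the instrument
  activatedErrors: number;       // messages the respondent actually encountered
}

function activatedErrorRate(counts: InstrumentErrorCounts): number {
  const possible =
    counts.possibleErrorMessages +
    counts.possiblePrompts +
    counts.possibleValidations;
  return possible === 0 ? 0 : counts.activatedErrors / possible;
}

// Example: 3 activated out of 60 possible checks -> 0.05.
console.log(activatedErrorRate({
  possibleErrorMessages: 25,
  possiblePrompts: 20,
  possibleValidations: 15,
  activatedErrors: 3,
}));
```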
    152. 152. Questionnaire Navigation Paradata cont.• Clicks on non-question links – Help, FAQs, etc. – Indication of when and where Rs use help or other information built into the survey and displayed as a link• Last question answered before dropping out – Helps to determine if the data collected can be classified as complete, partial, or breakoff – Used for response rate computation – Peytchev (2009) analyzed breakoff by question type • Open ended increased break-off chances by 2.5x; long questions by 3x; slider bars by 5x; introductory screens by 2.6x
    153. 153. Questionnaire Navigation Paradata cont.• Time per screen / time latency – Attitude strength – Response uncertainty – Response error• Examples – Heerwegh (2003) • R's with weaker attitudes take more time in answering survey questions than R's with stronger attitudes – Yan and Tourangeau (2008) • Higher-educated R's respond faster than lower-educated R's • Younger R's respond faster than older R's.
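Response latencies like those used by Heerwegh and by Yan and Tourangeau can be derived from paired "loaded" and "submitted" timestamps in a client-side event log. A minimal sketch with assumed event names:

```typescript
// Per-screen response latency (seconds) from paired paradata timestamps.
interface ScreenEvent {
  screenId: string;
  kind: "loaded" | "submitted";
  timestampMs: number;
}

function screenLatencies(events: ScreenEvent[]): Map<string, number> {
  const loadedAt = new Map<string, number>();
  const latencies = new Map<string, number>();
  for (const e of events) {
    if (e.kind === "loaded") {
      loadedAt.set(e.screenId, e.timestampMs);
    } else if (loadedAt.has(e.screenId)) {
      latencies.set(e.screenId, (e.timestampMs - loadedAt.get(e.screenId)!) / 1000);
    }
  }
  return latencies;
}

// Example: screen "q12" was answered 8.4 seconds after it loaded.
console.log(screenLatencies([
  { screenId: "q12", kind: "loaded", timestampMs: 1000 },
  { screenId: "q12", kind: "submitted", timestampMs: 9400 },
]).get("q12")); // 8.4
```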
    154. 154. Uses of Paradata – Call Centers• Self-administered (mail or electronic) surveys• Call transaction history software – Incoming calls • Date and time: useful for hiring, staffing, and workflow decisions • Purpose of the call – Content issue: useful for identifying problematic questions or support information – Technical issue: useful for identifying usability issues or system problems • Call outcome: type of assistance provided
    155. 155. Paradata• Background information about Paradata• Uses of Paradata by mode• Paradata issues
    156. 156. Paradata Issues• Reliability of data collected• Costs• Privacy and Ethical Issues
    157. 157. Reliability of data collected• Interviewers can erroneously record housing unit characteristics, misjudge features about respondents & fail to record a contact attempt• Web surveys can fail to load properly, and client-side paradata fails to be captured• Recordings of interviewers can be unusable (e.g., background noise, loose microphones) Casas-Cardero 2010; Sinibaldi 2010; West 2010
    158. 158. Paradata costs• Data storage – very large files• Instrument performance• Development within systems• Analysis
    159. 159. Privacy and Ethical Issues• IP addresses along with e-mail address or other information can be used to identify a respondent• This information needs to be protected
    160. 160. Paradata Activity• Should the respondent be informed that the organization is capturing paradata?• If so, how should that be communicated?
    161. 161. Privacy and Ethical Issues cont.• Singer & Couper asked members of the Dutch Longitudinal Internet Studies for the Social Sciences (LISS) panel at the end of the survey if they could collect paradata – 38.4% agreed• Asked before the survey – 63.4% agreed• Evidence that asking permission to use paradata might make R’s less willing to participate in a survey Couper & Singer, 2011
    162. 162. Privacy and Ethical Issues cont.• Reasons for failing to inform R’s about paradata or get their consent – Concept of paradata is unfamiliar and difficult for R’s to grasp – R’s associate it with the activities of advertisers, hackers or phishers – Asking for consent gives it more salience – Difficult to convey benefits of paradata for the R
    163. 163. Questions and Discussion
    164. 164. Usability Assessment
    165. 165. Usability Assessment• Usability vs. User Experience• Why, When, What?• Methods • Focus Groups, In-Depth Interviews • Ethnographic Observations, Diary Studies • Usability and Cognitive Testing• Lab, Remote, In-the-Field• Obstacles
    166. 166. Background Knowledge• What does usability mean to you?• Have you been involved in usability research?• How is “user experience” different from “usability?”
    167. 167. Usability Assessment• Usability vs. User Experience• Why, When, What?• Methods • Focus Groups, In-Depth Interviews • Ethnographic Observations, Diary Studies • Usability and Cognitive Testing• Lab, Remote, In-the-Field• Obstacles
    168. 168. Usability vs. User Experience• Usability: “the extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency, and satisfaction in a specified context of use.” ISO 9241-11• Usability.gov• User experience includes emotions, needs and perceptions.
    169. 169. Understanding Users: Whitney's 5 E's of Usability; Peter's User Experience Honeycomb. The 5 Es to Understanding Users (W. Quesenbery): http://www.wqusability.com/articles/getting-started.html User Experience Design (P. Morville): http://semanticstudios.com/publications/semantics/000029.php
    170. 170. User Experience
    171. 171. Measuring the UX• How does it work for the end user?• What does the user expect?• How does it make the user feel?• What are the user's story and habits?• What are the user's needs?
    172. 172. What people do on the Web Krug, S. Don’t Make Me Think
    173. 173. Usability Assessment• Usability vs. User Experience• Why, When, What?• Methods • Focus Groups, In-Depth Interviews • Ethnographic Observations, Diary Studies • Usability and Cognitive Testing• Lab, Remote, In-the-Field• Obstacles
    174. 174. Why is Testing Important?• Put it in the hands of the users.• Things may seem straightforward to you but maybe not to your users.
    175. 175. Why is Testing Important?• Put it in the hands of the users.• Things may seem straightforward to you but maybe not to your users.
    176. 176. Why is Testing Important?
    177. 177. Why is Testing Important?• Put it in the hands of the users.• Things may seem straightforward to you but maybe not to your users.• You might have overlooked something big!
    178. 178. When to test (diagram): test at every stage – test the concept, test the prototype with users, and test the final product with users.
    179. 179. What can be tested?• Existing surveys• Low-fidelity prototypes – Paper mockups or mockups on computer – Basic idea is there but not functionality or graphical look• High-fidelity prototypes – As close as possible to final interface in look and feel
    180. 180. Usability Assessment• Usability vs. User Experience• Why, When, What?• Methods • Focus Groups, In-Depth Interviews • Ethnographic Observations, Diary Studies • Usability and Cognitive Testing• Lab, Remote, In-the-Field• Obstacles
    181. 181. Methods to Understand Users (diagram): Usability Testing – ensure users can use products efficiently and with satisfaction; Ethnographic Observation – understand interactions in a natural environment; Cognitive Testing – ensure content is understood as intended; User Experience Research – assess emotions, perceptions, and reactions; Linguistic/Interactional Analysis – assess motivations and goals; Focus Groups and In-Depth Interviews – discuss users' perceptions and reactions; Surveys – randomly sample the population of interest
    182. 182. Focus Groups• Structured script• Moderator discusses the survey with actual or typical users – Actual usage of survey – Workflow beyond survey – Expectations and opinions – Desire for new features and functionality• Benefit of participants stimulating conversations, but risk of “group think”
    183. 183. In-Depth Interviews• Structured or unstructured• Talk one-on-one with users, in person or remotely – Actual usage of the survey – Workflow beyond survey – Expectations and opinions – Desire for new features and functionality
    184. 184. Usability Assessment• Usability vs. User Experience• Why, When, What?• Methods • Focus Groups, In-Depth Interviews • Ethnographic Observations, Diary Studies • Usability and Cognitive Testing• Lab, Remote, In-the-Field• Obstacles
    185. 185. Ethnographic Observations• Observe users in home, office or any place that is “real- world.”• Observer is embedded in the user’s culture.• Allows conversation & activity to evolve naturally, with minimum interference.• Observe settings and artifacts (other real-world objects).• Focused on context and meaning making.
    186. 186. Diaries/Journals• Users are given a journal or a web site to complete on a regular basis (often daily).• They record how/when they used the survey, what they did, and what their perceptions were. • User-defined data • Feedback/responses develop and change over time • Insight into how technology is used “on-the-go.”• There is often a daily set of structured questions and/or free-form comments.
    187. 187. Diaries/Journals• Users are given a journal or a web site to complete on a regular basis (often daily).• They record how/when they used the survey, what they did, and what their perceptions were. • User-defined data • Feedback/responses develop and change over time • Insight into how technology is used “on-the-go.”• There is often a daily set of structured questions and/or free-form comments.
    188. 188. Usability Assessment• Usability vs. User Experience• Why, When, What?• Methods • Focus Groups, In-Depth Interviews • Ethnographic Observations, Diary Studies • Usability and Cognitive Testing• Lab, Remote, In-the-Field• Obstacles
    189. 189. Usability Testing•Participants respond to survey items•Assess interface flow and design • Understanding • Confusion • Expectations•Ensure intricate skip and response patterns work as intended•Can test final product or early prototypes
    190. 190. Cognitive Testing•Participants respond to survey items•Assess text • Confusion • Understanding • Thought process•Ensure questions are understood as intended and resulting data is valid•Proper formatting is not necessary.
    191. 191. Usability vs. Cognitive Testing. Usability Testing Metrics: • Accuracy – In completing item/survey – Number/severity of errors • Efficiency – Time to complete item/survey – Path to complete item/survey • Satisfaction – Item-based – Survey-based – Verbalizations. Cognitive Testing Metrics: • Accuracy – Of interpretations – Verbalizations
    192. 192. Moderating Techniques (Technique / Pros / Cons). Concurrent Think Aloud (CTA) – Pros: understand participants' thoughts as they occur and as they attempt to work through issues they encounter; elicit real-time feedback and emotional responses. Cons: can interfere with usability metrics, such as accuracy and time on task. Retrospective Think Aloud (RTA) – Pros: does not interfere with usability metrics. Cons: overall session length increases; difficulty in remembering thoughts from up to an hour before = poor data. Concurrent Probing (CP) – Pros: understand participants' thoughts as they attempt to work through a task. Cons: interferes with natural thought process and progression that participants would make on their own, if uninterrupted. Retrospective Probing (RP) – Pros: does not interfere with usability metrics. Cons: difficulty in remembering = poor data. Romano Bergstrom, Moderating Usability Tests: http://www.usability.gov/articles/2013/04/moderating-usability-tests.html
    193. 193. Choosing a Moderating Technique• Can the participant work completely alone?• Will you need time on task and accuracy data?• Are the tasks multi layered and/or require concentration?• Will you be conducting eye tracking?
    194. 194. Tweaking vs. Redesign. Tweaking: • Less work • Small changes occur quickly • Small changes are likely to happen. Redesign: • Lots of work after much has already been invested • May break something else • A lot of people • A lot of meetings
    195. 195. Usability Assessment• Usability vs. User Experience• Why, When, What?• Methods • Focus Groups, In-Depth Interviews • Ethnographic Observations, Diary Studies • Usability and Cognitive Testing• Lab, Remote, In-the-Field• Obstacles
    196. 196. Lab vs. Remote vs. In the Field. Laboratory: • Controlled environment • All participants have the same experience • Record and communicate from control room • Observers watch from control room and provide additional probes (via moderator) in real time • Incorporate physiological measures (e.g., eye tracking, EDA) • No travel costs. Remote: • Participants in their natural environments (e.g., home, work) • Use video chat (moderated sessions) or online programs (unmoderated) • Conduct many sessions quickly • Recruit participants in many locations (e.g., states, countries). In the Field: • Participants tend to be more comfortable in their natural environments • Recruit hard-to-reach populations (e.g., children, doctors) • Moderator travels to various locations • Bring equipment (e.g., eye tracker) • Natural observations
    197. 197. Lab-Based Usability Testing (photos of the Fors Marsh Group UX Lab): participant in the testing room; live streaming close-up of the participant's screen; large screens to display material during focus groups; observation area for clients. We maneuver the cameras, record, and communicate through microphones and speakers from the control room so we do not interfere.
    198. 198. Eye Tracking• Desktop• Mobile• Paper Fors Marsh Group UX Lab
    199. 199. Remote Moderated Testing (photos): moderator working from the office; observer taking notes, remains unseen from participant; participant working on the survey from her home in another state. Fors Marsh Group UX Lab
    200. 200. Field Studies (photos): researcher goes to participant's workplace to conduct the session; she observes and takes notes. Participant uses books from her natural environment to complete tasks on the website. Participant is in her natural environment, completing tasks on a site she normally uses for work.
    201. 201. Usability Assessment• Usability vs. User Experience• Why, When, What?• Methods • Focus Groups, In-Depth Interviews • Ethnographic Observations, Diary Studies • Usability and Cognitive Testing• Lab, Remote, In-the-Field• Obstacles
    202. 202. Obstacles to Testing• “There is no time.” – Start early in development process. – One morning a month with 3 users (Krug) – 12 people in 3 days (Anderson Riemer) – 12 people in 2 days (Lebson & Romano Bergstrom)• “I can’t find representative users.” – Everyone is important. – Travel – Remote testing• “We don’t have a lab.” – You can test anywhere.
    203. 203. Final Thoughts• Test across devices. • “User experience is an ecosystem.”• Test across demographics. • Older adults perform differently than young.• Start early. Kornacki, 2013, The Long Tail of UX
    204. 204. Questions & Discussion
    205. 205. Quality of Mixed ModesThe views expressed on statistical or methodological issues are those of the presenters and not necessarily those of the U.S. Census Bureau.
    206. 206. Quality of Mixed Modes• Mixed Mode Surveys• Response Rates• Mode Choice 213
    207. 207. Mixed Mode Surveys• Definition: Any combination of survey data collection methods/modes• Mixed vs. Multi vs. Multiple – Modes• Survey organization goal: – Identify optimal data collection procedure (for the research question) – Reduce Total Survey Error – Stay within time/budget constraints 214
    208. 208. Mixed Mode Designs• Sequential – Different modes for different phases of interaction (initial contact, data collection, follow-up) – Different modes used in sequence during data collection (i.e., panel survey which begins in one mode and moves to another)• Concurrent – different modes implemented at the same time 215 de Leeuw & Hox, 2008
    209. 209. Why Do Mixed Mode?• Cost savings• Improve Timeliness• Reduces Total Survey Error – Coverage error – Nonresponse error – Measurement error 216
    210. 210. Mixed Modes – Cost Savings• Mixed mode designs give an opportunity to compensate for the weaknesses of each individual mode in a cost effective way (de Leeuw, 2005)• Dillman 2009 Internet, Mail, and Mixed-Mode Surveys book: – Organizations often start with lower cost mode and move to more expensive one • In the past: start with paper then do CATI or in person nonresponse follow-up (NRFU) • Current: start with Internet then paper NRFU 217
    211. 211. Mixed Modes – Cost Savings cont.• Examples: • U.S. Current Population Survey (CPS) – panel survey – Initially does in-person interview and collects a telephone number – Subsequent calls made via CATI to reduce cost • U.S. American Community Survey – Phase 1: mail – Phase 2: CATI NRFU – Phase 3: face-to-face with a subsample of remaining nonrespondents 218
    212. 212. Mixed Mode - Timeliness• Collect responses more quickly• Examples: – Current Employment Statistics (CES) offers 5 modes (Fax, Web, Touch-tone Data Entry, Electronic Data Interchange, & CATI) to facilitate timely monthly reporting 219
    213. 213. Why Do Mixed Mode?• Cost savings• Improve Timeliness• Reduces Total Survey Error – Coverage error – Nonresponse error – Measurement error 220
    214. 214. Total Survey Error Framework Groves et al. 2004; Groves & Lyberg 2010
    215. 215. Mixed Mode - Coverage Error• Definition: proportion of the target population that is not covered by the survey frame and the difference in the survey statistic between those covered and not covered• Telephone penetration – Landlines vs. mobile phones• Web penetration 222 Groves, 1989
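The two parts of this definition combine into the coverage-bias expression from the Groves (1989) framework. A minimal sketch follows; the proportions and means are made-up illustrations, not survey results:

```python
def coverage_bias(prop_not_covered, mean_covered, mean_not_covered):
    """Bias of a frame-based estimate of a mean:
    (proportion of the target population not on the frame)
    x (difference between covered and not-covered means)."""
    return prop_not_covered * (mean_covered - mean_not_covered)

# Hypothetical illustration: if 1 in 5 adults is not reachable online, and the
# covered and not-covered groups differ on the statistic (0.62 vs. 0.48),
# a web-only estimate is biased upward by about 3 percentage points.
print(coverage_bias(0.20, 0.62, 0.48))  # 0.028
```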
    216. 216. Coverage – Telephone• 88% of U.S. adults have a cell phone• Young adults, those with lower education, and lower household income more likely to use mobile devices as main source of internet access 223 Smith, 2011; Zickuhr & Smith, 2012
    217. 217. Coverage - Internet• Coverage is limited – No systematic directory of addresses• 1 in 5 in U.S. do not use the Internet 224 Zickuhr & Smith, 2012
    218. 218. [Figure] 225
    219. 219. World Internet Statistics 226
    220. 220. Coverage – Web cont.• Indications that Internet adoption rates have leveled off• Demographics least likely to have Internet – Older – Less education – Lower household income• Main reason for not going online: they do not think the Internet is relevant to them 227 Pew, 2012
    221. 221. European Union – Characteristics of Internet Users 228
    222. 222. Coverage - Web cont.• R’s reporting via Internet can be different from those reporting via other modes – Internet vs. mail (Diment & Garrett-Jones, 2007; Zhang, 2000)• R’s cannot be contacted through the Internet because e-mail addresses lack the structure needed for generating random samples (Dillman, 2009) 229
    223. 223. Mixed Mode – Nonresponse Error• Definition: inability to obtain complete measurements on the survey sample (Groves, 1998) – Unit nonresponse – the entire sampling unit fails to respond – Item nonresponse – R’s respond but fail to answer one or more questions• Concern is that respondents and non-respondents may differ on the variable of interest 230
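The concern in the last bullet can be made concrete with the standard deterministic approximation of nonresponse bias for a respondent mean. A minimal sketch with hypothetical values:

```python
def nonresponse_bias(nonresponse_rate, mean_respondents, mean_nonrespondents):
    """Approximate bias of the respondent mean:
    (unit nonresponse rate) x (respondent mean - nonrespondent mean).
    The bias vanishes only if nonrespondents resemble respondents
    on the variable of interest, regardless of the response rate."""
    return nonresponse_rate * (mean_respondents - mean_nonrespondents)

# Hypothetical: 40% unit nonresponse; respondents average 3.1 on some item,
# nonrespondents (if we could observe them) would average 2.5.
print(nonresponse_bias(0.40, 3.1, 2.5))  # 0.24
```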
    224. 224. Mixed Mode – Nonresponse cont.• Overall response rates have been declining• Mixed mode is a strategy used to increase overall response rates while keeping costs low• Some R’s have a mode preference (Miller, 2009) 231
    225. 225. Mixed Mode – Nonresponse cont.• Some evidence of a reduction in overall response rates when multiple modes offered concurrently in population/household surveys – Examples: Delivery Sequence File Study (Dillman, 2009); Arbitron Radio Diaries (Gentry, 2008); American Community Survey (Griffin et al., 2001); Survey of Doctorate Recipients (Grigorian & Hoffer, 2008)• Could assign R’s to modes based on known preferences 232
    226. 226. Mixed Mode – Measurement Error• Definition: “observational errors” arising from the interviewer, instrument, mode of communication, or respondent (Groves, 1998)• Providing mixed modes can help reduce the measurement error associated with collecting sensitive information – Example: Interviewer begins face-to-face interview (CAPI) then lets R continue on the computer with headphones (ACASI) to answer sensitive questions 233
    227. 227. Mode Comparison Research• Meta-analysis of mode comparison articles found: – Harder to get mail responses – Overall non-response rates & item non-response rates are higher in self-administered questionnaires, BUT answered items are of high quality – Small difference in quality between face-to-face and telephone (CATI) surveys – Face-to-face surveys had slightly lower item non-response rates 234 de Leeuw, 1992
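Since these comparisons hinge on item non-response rates, here is one common way to operationalize that rate at the respondent level. This is a minimal sketch; the item names and answers are hypothetical:

```python
def item_nonresponse_rate(responses):
    """Share of applicable items a respondent left unanswered.
    `responses` maps item name -> answer, with None meaning no answer
    to an item the respondent should have answered."""
    if not responses:
        return 0.0
    missing = sum(1 for answer in responses.values() if answer is None)
    return missing / len(responses)

# Hypothetical self-administered questionnaire: one of four items skipped.
print(item_nonresponse_rate({"q1": "yes", "q2": None, "q3": 4, "q4": "no"}))  # 0.25
```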
    228. 228. Mode Comparison Research cont.• Question order and response order effects less likely in self-administered than telephone – R’s more likely to choose last option heard in CATI (recency effect) – R’s more likely to choose the first option seen in self-administered (primacy effect) – Mixed results on item-nonresponse rates in Web 235 de Leeuw, 1992; 2008
    229. 229. Mode Comparison Research cont.• Some indication that Internet surveys are more like mail than telephone surveys – Visual presentation vs. auditory• Conflicting evidence on item non-response (some show higher item non-response on Internet vs. mail, while others show no difference)• Some evidence of better quality data – Fewer post-data-collection edits needed for electronic vs. mail responses 236 Sweet & Ramos, 1995; Griffin et al., 2001
    230. 230. Disadvantages of Mixed Mode• Mode Effects – Concerns for measurement error due to the mode • R’s providing different answers to the same questions displayed in different modes – Different contact/cooperation rates because of different strategies used to contact R’s 237
    231. 231. Disadvantages of Mixed Mode• Decrease in overall response rates – Why: to assess the effect of offering a mail/web mode choice – What: meta-analysis of 16 studies that compared mixed-mode surveys with mail and web options – Results: empirical evidence that offering mail and Web concurrently resulted in a significant reduction in response rates 238 Medway & Fulton, 2012
    232. 232. Response Rates in Mixed Mode Surveys• Why is this happening? – Potential Hypothesis #1: R’s are dissuaded from responding because they have to make a choice • Offering multiple modes increases burden (Dhar, 1997) • While judging the pros/cons of each mode, neither appears attractive (Schwartz, 2004) – Potential Hypothesis #2: R’s choose Web, but never actually respond • If R’s receive the invitation in the mail, there is a break in their response process (Griffin et al., 2001) – Potential Hypothesis #3: R’s who choose Web may get frustrated with the instrument and abandon the whole process (Couper, 2000) 239
    233. 233. Overall Goals• Find the optimal mix given the research questions and population of interest• Other factors to consider: – Reducing Total Survey Error (TSE) – Budget – Time – Ethics and/or privacy issues 240 Biemer & Lyberg, 2003
    234. 234. Quality of Mixed Modes• Mixed Mode Surveys• Response Rates• Mode Choice 241
    235. 235. Technique for Increasing Response Rates to Web in Multi-Mode Surveys• “Pushing” R’s to the web – Sending R’s an invitation to report via Web – No paper questionnaire in the initial mailing – Invitation contains information for obtaining the alternative version (typically paper) – Paper versions are mailed out during follow-up to capture responses from those who do not have web access or do not want to respond via web – “Responding to Mode in Hand” Principle 242
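As a sketch of what a “push” mailing sequence might look like in practice, the schedule below lays out the steps described on this slide. The week offsets, mailing pieces, and contents are hypothetical assumptions, not the design of any survey cited here:

```python
# Hypothetical "push to web" contact schedule; offsets and contents are assumptions.
push_to_web_schedule = [
    {"week": 0, "mailing": "invitation letter",
     "contains": ["web URL", "login ID", "how to request a paper form"]},
    {"week": 1, "mailing": "reminder postcard",
     "contains": ["web URL", "login ID"]},
    {"week": 3, "mailing": "nonresponse follow-up",
     "contains": ["paper questionnaire", "web URL", "login ID"]},
]

for step in push_to_web_schedule:
    print(f"Week {step['week']}: {step['mailing']} ({', '.join(step['contains'])})")
```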
    236. 236. “Pushing” Examples• Example 1: Lewiston-Clarkston Quality of Life Survey• Example 2: 2007 Stockholm County Council Public Health Survey• Example 3: American Community Survey• Example 4: 2011 Economic Census Re-file Survey 243
    237. 237. Pushing Example 1 – Lewiston-Clarkston Quality of Life Survey• Goals: increase web response rates in a paper/web mixed-mode survey and identify mode preferences• Method: – November 2007 – January 2008 – Random sample of 1,800 residential addresses – Four treatment groups – To assess mode preference, this question was at the end of the survey: • “If you could choose how to answer surveys like this, which one of the following ways of answering would you prefer?” • Answer options: web, mail, or telephone 244 Miller, O’Neill, Dillman, 2009
    238. 238. Pushing Example 1 – cont.• Group A: Mail preference with web option – Materials suggested mail was preferred but web was acceptable• Group B: Mail Preference – Web option not mentioned until first follow-up• Group C: Web Preference – Mail option not mentioned until first follow-up• Group D: Equal Preference 245
    239. 239. Pushing Example 1 – cont.• Results 246
    240. 240. Pushing Example 1 – cont.“If you could choose how to answer surveys like this, which one of the following ways of answering would you prefer?” 247
    241. 241. Pushing Example 1 – cont. 248
    242. 242. Pushing Example 1 – cont.Group C = Web Preference Group 249
    243. 243. Pushing Example 1 – cont.• Who can be pushed to the Web? 250
    244. 244. Pushing Example 2 – 2007 Stockholm County Council Public Health Survey• Goal: increase web response rates in a paper/web mixed-mode survey• Method: – Sample of 50,000 (62% overall response rate) – 4 treatments that varied in “web intensity,” plus a “standard” option (paper questionnaire with web login data) 251 Holmberg, Lorenc, Werner, 2008
    245. 245. Pushing Example 2 – Cont.• Overall response rates. S = Standard; A1 = very paper “intense”; A2 = paper “intense”; A3 = web “intense”; A4 = very web “intense” 252
    246. 246. Pushing Example 2 – Cont.• Web responses. S = Standard; A1 = very paper “intense”; A2 = paper “intense”; A3 = web “intense”; A4 = very web “intense” 253
    247. 247. Pushing Example 3 – American Community Survey• Goals: – Increase web response rates in a paper/web mixed-mode survey – Identify ideal timing for non-response follow-up – Evaluate advertisement of the web choice 254 Tancreto et al., 2012
    248. 248. Pushing Example 3 – Cont.• Method – Push: 3 versus 2 weeks until paper questionnaire – Choice: Prominent and Subtle – Mail only (control) – Tested among segments of US population • Targeted • Not Targeted 255
    249. 249. Response Rates by Mode in Targeted Areas [Stacked bar chart of Internet and Mail response rates for: Ctrl (Mail only), Prom Choice, Subtle Choice, Push (3 weeks), Push (2 weeks)] 256
    250. 250. Response Rates by Mode in Not Targeted Areas [Stacked bar chart of Internet and Mail response rates for: Ctrl (Mail only), Prom Choice, Subtle Choice, Push (3 weeks), Push (2 weeks)] 257
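The stacked bars in the two charts above decompose the overall response rate into mode-specific pieces. A minimal sketch of that decomposition with made-up counts (real ACS rates also adjust the denominator for undeliverable and ineligible addresses, which is omitted here):

```python
def response_rates_by_mode(completes_by_mode, mailout_size):
    """Mode-specific response rates plus their sum (the overall rate),
    using a simple completes-over-mailout calculation."""
    rates = {mode: n / mailout_size for mode, n in completes_by_mode.items()}
    rates["overall"] = sum(completes_by_mode.values()) / mailout_size
    return rates

# Hypothetical counts for one treatment panel.
print(response_rates_by_mode({"internet": 280, "mail": 120}, 1000))
# {'internet': 0.28, 'mail': 0.12, 'overall': 0.4}
```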
    251. 251. Example 4: Economic Census Refile• Goal: to increase Internet response rates in a paper/Internet establishment survey during non-response follow-up• Method: 29,000 delinquent respondents were split between two NRFU mailings – Letter-only mailing mentioning Internet option – Letter and paper form mailing 258 Marquette, 2012
    252. 252. Example 4: Cont. 259
    253. 253. Quality of Mixed Modes• Mixed Mode Surveys• Response Rates• Mode Choice 260
    254. 254. Why Do Respondents Choose Their Mode?• Concern about “mode paralysis” – When two options are offered, R’s must choose between tradeoffs – This choice makes each option less appealing – By offering a choice between Web and mail, we may be discouraging response 261 Miller and Dillman, 2011
    255. 255. Mode Choice• American Community Survey – Attitudes and Behavior Study• Goals: – Measure why respondents chose the Internet or paper mode during the American Community Survey Internet Test – Determine why there was nonresponse and whether it was linked to the multi-mode offer 262 Nichols, 2012
    256. 256. Mode Choice – cont.• CATI questionnaire was developed in consultation with survey methodologists• Areas of interest included: – Salience of the mailing materials and messages – Knowledge of the mode choice – Consideration of reporting by Internet – Mode preference 263
    257. 257. Mode Choice – cont.• 100 completed interviews per notification strategy (push and choice) 264
    258. 258. Mode Choice – cont.• Results – Choice/Push Internet respondents opted for perceived benefits – easy, convenient, fast – Push R’s noted that not having the paper form motivated them to use the Internet to report – Push R’s who reported via mail did so because they did not have Internet access or had computer problems – The placement of the message about the Internet option was reasonable to R’s – R’s often recalled that the letter accompanying the mailing package mentioned the mode choice 265
    259. 259. Mode Choice – cont.• Results cont. – Several nonrespondents cited not knowing that a paper option was available as a reason for not reporting – Very few nonrespondents attempted to access the online form – Low salience of the mailing package and being busy were the main reasons for nonresponse – The ABS study did NOT find “mode paralysis” 266
    260. 260. Questions and Discussion
    261. 261. Amy Anderson Riemer, US Census Bureau, amy.e.anderson.riemer@census.gov; Jennifer Romano Bergstrom, Fors Marsh Group, jbergstrom@forsmarshgroup.com
