accm-brfss-2022-presentation-draft.pptx
1. Developing a (partially) Automated Call Center Monitoring/Management Tool Using Machine Learning and Practical Insights
Matt Jans, Zoe Padgett, James Dayton, Don Allen, Josh Duell, Shawn Underwood, Mary Penn, Dave Roe, Lew Berman
BRFSS Annual Meeting (Virtual)
3/24/22
3. General interview quality assurance (QA) challenges
• Time and effort to review
- Organizations generally QA/QC 5-10% of workload (i.e., completes, dials, hours)¹
- A small percentage, yet most reviewed cases aren’t problematic
- Difficult to balance breadth and depth
¹ Jans et al. (2018). Roundtable: What Makes a Good Interviewer? Metrics and Methods for Ensuring Data Quality. Federal Computer-Assisted Survey Information Collection (FedCASIC) Workshop.
4. Motivating questions for an automated call center monitoring/management system
• Can we automate portions of our QA procedures?
• How do we monitor a larger percentage of work than we do with typical QA without increasing QA costs?
• Will automating mundane aspects of QA work allow QA staff to spend more time identifying more complex problems, and coaching?
5. Has anyone done this?
• Comprehensive QA systems exist²
• Question text, audio recordings, and coding forms presented on a single screen
• Some automation³
• To our knowledge, no system has both elements
• Please tell us if you know of any!
² Thissen, M. R. (2014). Computer Audio-Recorded Interviewing as a Tool for Survey Research. Social Science Computer Review, 32(1), 90–104. https://doi.org/10.1177/0894439313500128
³ Timbrook, J., & Eck, A. (2019, February 26). Humans vs. Machines: Comparing Coding of Interviewer Question-Asking Behaviors Using Recurrent Neural Networks to Human Coders. 2019 Workshop: Interviewers and Their Effects from a Total Survey Error Perspective, Lincoln, NE. (See full dissertation at https://www.proquest.com/openview/f3dd1734067619e90afe6d0d3c8a5c58/1?pq-origsite=gscholar&cbl=44156)
6. How we started to answer these questions
• Can we automate portions of our QA procedures?
• Using survey interview recordings
• Machine learning speech recognition and language understanding (i.e., natural language processing)
• How do we monitor a larger percentage of work than we do with typical QA without increasing QA costs?
• Human QA staff can only review so many recordings or parts of recordings each month
• Automatically identifying clear, simple problems so QA staff can spend more time on those that only humans can find
• Triaging and filtering problems for QA confirmation to simplify and standardize the QA task
• Ideally, “touching” 100% of interviews or calls, even if only on basic errors
• Will automating mundane aspects of QA work allow QA staff to spend more time identifying more complex problems, and coaching?
• Long-term goal
8. ACCM development phases
• Prior to 2021 – Testing and development phases
• Goals
• Establish how well speech recognition could identify…
• …interviewer and respondent turns and the words spoken
• …misrecordings (i.e., interviewer doesn’t enter what respondent said)
• What we learned
• Works in situations with low audio fidelity (i.e., phone interviews, even mobile phones)
• Tweaking speech recognition and parsing algorithms takes time and skill
• Question and answer speech identified more reliably on simple questions
• Yes/No questions work best
• 2021 forward – Implementation
• Goal: Work ACCM into production QA review each month with Yes/No questions in BRFSS
• What we learned: Very low rates of misrecordings; more details in progress
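The misrecording check described above (comparing what the respondent said with what the interviewer entered on a Yes/No question) can be sketched roughly as follows. This is a minimal illustration, not ACCM's actual implementation: the keyword lists, the function names `classify_yes_no` and `flag_for_review`, and the choice to flag ambiguous transcripts are all assumptions made for this example, and the input is assumed to be a respondent turn already transcribed by a speech recognizer.

```python
import re

# Hypothetical keyword lists for illustration only; a production system
# would need much richer language handling than simple word matching.
YES_WORDS = {"yes", "yeah", "yep", "yup"}
NO_WORDS = {"no", "nope", "nah"}


def classify_yes_no(respondent_text):
    """Return 'yes', 'no', or None when the transcript is ambiguous."""
    tokens = set(re.findall(r"[a-z']+", respondent_text.lower()))
    said_yes = bool(tokens & YES_WORDS)
    said_no = bool(tokens & NO_WORDS)
    if said_yes == said_no:  # neither keyword found, or both found
        return None
    return "yes" if said_yes else "no"


def flag_for_review(respondent_text, entered_code):
    """Flag a case when the machine-read answer differs from the entered
    answer, or (an assumption of this sketch) when no answer can be read."""
    heard = classify_yes_no(respondent_text)
    return heard is None or heard != entered_code
```

Under these assumptions, `flag_for_review("Yeah, I have", "no")` would send the case to a human reviewer, while `flag_for_review("Nope", "no")` would not.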
11. QA review process – all differences between ACCM and data record
[Figure labels: “What ACCM thinks respondent said” and “What interviewer entered”]
12. Adjudication process
QA Form (reduced) goes here (Check my notes on what stays/goes)
Include notes
“Description” only used when there’s a discrepancy (update text in box)
16. How we continue to answer these questions
• Can we automate portions of our QA procedures?
• Yes, but with mixed empirical results
• Only 5.3% of ACCM-screened recordings identified as interviewer misrecording
• After QA review, only 0.2% (overall) misrecording (3.7% “machine errors”)
• 70% of cases flagged by ACCM
• Need to assess whether interviewer misrecording is worth automating
• How do we monitor a larger percentage of work than we do with typical QA without increasing QA costs?
• Core of the ACCM system still meets this goal
• Will automating mundane aspects of QA work allow QA staff to spend more time identifying more complex problems, and coaching?
• Probably! No direct measure right now, but future directions are positive
18. What’s next for ACCM
• Continue finding ways to review larger percentage of interviews
- And aspects of the interview most likely to reveal the largest problems
• Investigating introductions (first 30 seconds)
• Correlation between easy-to-measure metrics and hard-to-measure ones
- Correlate misrecordings and problems in first 30 seconds with other coded interviewer issues like verbatim reading and neutral probing
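One simple way to examine the proposed relationship between easy-to-measure and hard-to-measure metrics is a Pearson correlation computed across interviewers. The sketch below is only an illustration: the slides do not say which statistic will be used, and the function name `pearson_r` and the idea of using one rate per interviewer for each metric are assumptions made here.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers.

    Returns None when either list has no variation, because the
    coefficient is undefined in that case.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)
```

Here `xs` might hold each interviewer's misrecording rate and `ys` the same interviewers' rate of verbatim-reading problems from human coding. A value near zero would suggest the easy metric says little about the hard one.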
20. ACCM Background
• Multiple small-scale pilots led by Lew Berman
• Major Finding
- High reliability between auto-coded responses and human-coded responses for simple questions with simple answer categories (Berman, Boyle, Allen, Duell, Jans, Iachan, McCoy, 2019)
• General self-rated health (S1Q1): 96% agreement (κ = 0.94)
• Smoked 100 cigarettes (S9Q1): 74% agreement (κ = 0.59)
• Seatbelt frequency (s13q1): 70% agreement (κ = n/a)
• Employment (s8q15): 61% agreement (κ = 0.50)
• Outdoor activity reduce/change (CT13_2): 20% agreement (κ = n/a)
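The agreement figures above pair raw percent agreement with Cohen's kappa (κ), which adjusts for the agreement expected by chance. As a rough sketch of how the two statistics are computed from paired codes (the function names and the list-of-labels input format are assumptions for illustration, not the pilot studies' actual code):

```python
from collections import Counter


def percent_agreement(auto_codes, human_codes):
    """Share of cases where the two coders assigned the same code."""
    matches = sum(a == h for a, h in zip(auto_codes, human_codes))
    return matches / len(auto_codes)


def cohens_kappa(auto_codes, human_codes):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e).

    p_o is observed agreement; p_e is the agreement expected by chance
    given each coder's marginal code frequencies. Returns None when
    p_e equals 1, where kappa is undefined.
    """
    n = len(auto_codes)
    p_o = percent_agreement(auto_codes, human_codes)
    auto_counts = Counter(auto_codes)
    human_counts = Counter(human_codes)
    categories = auto_counts.keys() | human_counts.keys()
    p_e = sum(auto_counts[c] * human_counts[c] for c in categories) / (n * n)
    if p_e == 1.0:
        return None
    return (p_o - p_e) / (1 - p_e)
```

Kappa can sit well below percent agreement when one answer category dominates, since much of the raw agreement is then expected by chance.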
Editor's Notes
Background: One major challenge to interview(er) quality control is the volume of interviewers, interviews, and interviewing time that must be monitored. In an ideal world, 100% of the time that an interviewer spends dialing and interviewing would be monitored. Because that is not possible, most data collectors monitor a small fraction of interviewers, interviews, or interviewing time, with specific approaches varying widely throughout the industry.
Objectives: We present ICF’s progress developing a system that automates parts of the quality assurance review process to a) increase the number of interviews touched by QA review, and b) relieve QA staff of some of the manual work required to review interviews and interviewers so that they can spend more time coaching interviewers and identifying and verifying problems that cannot be automatically identified.
Methods: Using digital audio recordings of BRFSS interviews, ICF has developed a method of automatically identifying interviews with potential quality problems, specifically cases where the interviewer entered a different response than the one the respondent gave. These errors may be indicative of other interview or interviewer quality issues.
Results: We present a general discussion of the development efforts behind this project, efforts taken to validate the system, and initial applications.
Conclusions: We summarize lessons learned and present future directions of ACCM expansion.