Presented at GOR 23 on 21 September 2023. We present four use cases for applying Generative AI, such as ChatGPT / GPT-4, to the quantitative analysis of survey data, including examples of prompts and generated output.
This work was presented in the session on 'Innovation in Practice (2): Opportunities and Challenges' at the General Online Research Conference 2023 (https://www.conftool.org/gor23/index.php?page=browseSessions&form_session=77). Supporting write-ups are available at https://www.inspirient.com/case-studies/survey_analysis_automation.php and https://www.inspirient.com/case-studies/survey_quality_assurance.php
-- Full abstract below --
Relevance & Research Question:
Rigorous, well-documented data analysis is a cornerstone of opinion research. ChatGPT and GPT-4 are the most recent additions to the toolset of every researcher and social scientist and are thought to have the potential to boost efficiency by 50% or more, yet they have trouble reliably adding more than three numbers, let alone applying complex statistical methods! This talk takes a use-case-driven approach to answering the following research questions: How can current Large Language Models (LLMs), like ChatGPT / GPT-4, be applied to rigorous analysis of survey data? Which use cases are feasible, and with which tradeoffs?
Methods & Data:
We've jointly identified four major use cases for LLMs applied to the analysis of raw survey data: 1) natural-language data exploration, 2) desk research support, 3) generation of deliverables, and 4) LLM chatbots as client-facing deliverables. For each of these use cases, we've preprocessed samples of raw survey data in a fully automated process so that current LLMs can be applied as the use case requires. We've validated the textual results across teams to understand their respective potential and shortcomings, and consequently improved both our automated preprocessing and our LLM prompting.
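To illustrate why preprocessing matters when an LLM cannot be trusted with arithmetic, here is a minimal sketch in Python. It is not the pipeline used in the talk; the sample rows, the column names, and the helpers `frequency_table` and `build_prompt` are all hypothetical. The idea is simply that the counts and percentages are computed in ordinary code, and the LLM only receives finished numbers to put into words.

```python
from collections import Counter

# Hypothetical raw survey rows; in practice these would come from an SPSS or Excel file.
RESPONSES = [
    {"age_group": "18-29", "satisfaction": "high"},
    {"age_group": "18-29", "satisfaction": "low"},
    {"age_group": "30-49", "satisfaction": "high"},
    {"age_group": "30-49", "satisfaction": "high"},
]


def frequency_table(rows, column):
    """Return {answer: (count, percent)} for one survey column."""
    counts = Counter(row[column] for row in rows)
    total = sum(counts.values())
    return {answer: (n, round(100 * n / total, 1)) for answer, n in counts.items()}


def build_prompt(rows, columns):
    """Render precomputed frequency tables as plain text for an LLM prompt."""
    lines = ["Summarize the survey results below. Use only the numbers given."]
    for column in columns:
        lines.append(f"Question: {column}")
        for answer, (n, pct) in sorted(frequency_table(rows, column).items()):
            lines.append(f"  {answer}: {n} responses ({pct}%)")
    return "\n".join(lines)


# The prompt text (an assumed wording) that would be sent to the LLM.
PROMPT = build_prompt(RESPONSES, ["age_group", "satisfaction"])
```

With this split of responsibilities, every number in the prompt can be recomputed from the raw data without involving the LLM at all.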
Results:
The results across the four use cases are impressive: Given nothing more than raw survey data (in the form of an SPSS or Microsoft Excel file), we were able to reliably use LLMs to produce relevant output for each use case. The key aspect of our approach is the capability to link each LLM statement back to the analytical results it is based on, thereby establishing a chain of trust for each statement of the LLM conversation that regular LLM use would otherwise lack.
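One way to picture such a chain of trust is sketched below in Python. This is an assumption about how the linking could work, not a description of the actual implementation: the result IDs (`R1`, `R2`), the `[R1]` citation format, and the `check_statement` helper are hypothetical. Each LLM statement must cite a precomputed result, and any number it quotes must match a cited value.

```python
import re

# Hypothetical precomputed analytical results, keyed by an ID the LLM is asked to cite.
RESULTS = {
    "R1": {"description": "share of respondents with high satisfaction (%)", "value": 75.0},
    "R2": {"description": "number of respondents", "value": 4},
}

CITATION = re.compile(r"\[(R\d+)\]")   # matches citations such as [R1]
NUMBER = re.compile(r"\d+(?:\.\d+)?")  # matches integers and simple decimals


def check_statement(statement, results):
    """Return (ok, reason) for a single LLM statement."""
    cited = CITATION.findall(statement)
    if not cited:
        return False, "no citation"
    unknown = [c for c in cited if c not in results]
    if unknown:
        return False, f"unknown result {unknown[0]}"
    # Remove the citations first so the digits inside "[R1]" are not treated as data.
    text = CITATION.sub("", statement)
    allowed = {float(results[c]["value"]) for c in cited}
    for token in NUMBER.findall(text):
        if float(token) not in allowed:
            return False, f"number {token} not backed by cited results"
    return True, "ok"
```

For example, `check_statement("75% of respondents report high satisfaction [R1].", RESULTS)` passes, while the same sentence with "80%" is rejected because the quoted number does not match the cited result. A statement without any citation is rejected outright.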
Added Value:
At this time it's hard to fully estimate the automation potential and the resulting efficiency gains for research agencies. We can, however, state with confidence that the provided samples for all four use cases were generated within about one hour for the given SPSS/Excel input data. In our talk, we'll demonstrate this process step by step and, based on a live demo, explore together the potential and limitations of this approach.
Trustworthy Analytics with Generative AI: Four Use Cases for ChatGPT / GPT-4
1. Trustworthy Analytics with Generative AI:
Four Use Cases for ChatGPT / GPT-4
Dr. Georg Wittenburg ▪ Inspirient
General Online Research Conference 2023 (GOR 23)
Kassel, 21 September 2023