Building Surveys in Qualtrics for Efficient Analytics

• Qualtrics® is a state-of-the-art online research suite which enables
sophisticated data collection and analytics. This presentation will describe
how to build a survey for efficient analytics, both within Qualtrics® and
outside Qualtrics®. This presentation emphasizes the importance of
thinking through the data collection, the analytics, and the data
presentation, in order to build a survey instrument that works for the
research context. Along the way, some of the cutting-edge survey-building
capabilities of Qualtrics® (including rich question types, invisible
questions, branching logic, display logic, panel triggers, and others) will
be showcased, along with the data analytics functionalities (including
cross-tab analysis and data visualizations).
Presentation Overview

• Review of the literature
• Research design
• Design of research instrument (such as a survey instrument)
– Pilot testing
– Reliability (usually test-retest reliability based on time-separated testing, along with tests of internal consistency such as Cronbach’s alpha)
– Validity [of construct(s)] (eliminating multicollinearity, with factor analysis testing; eliminating questions which do not contribute meaningfully to a construct; going with top components in a scree plot to hone an instrument’s focus on meaningful constructs)
• Institutional Review Board (IRB) oversight and approval
– Informed consent (including length of time needed to complete survey), control of respondent incentives, beneficence, non-deception, not collecting excess information, privacy protections, data handling and preservation, and others
• Respondent sampling
• Data capture
• Data cleaning
• Data analytics
– Evaluation of survey instrument and research method
– Evaluation of targeted data
• Manual coding
• Data queries
• Autocoding
• Data visualizations
• Discussion
– Including design of survey instrument and its testing
• Write-up and presentation
General Research Overview
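
The internal-consistency check named in the reliability bullet can be sketched as follows. This is a minimal illustration, assuming the pilot answers are already numeric and complete; the function name `cronbach_alpha` and the pilot scores are hypothetical, not from the presentation.

```python
from statistics import variance

def cronbach_alpha(items):
    """items: one list of scores per survey item, respondents in the same order."""
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    # Total score per respondent, summed across all items.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(scores) for scores in items)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))

# Hypothetical pilot data: 3 Likert items answered by 5 respondents.
pilot = [
    [4, 5, 3, 4, 2],
    [4, 4, 3, 5, 2],
    [5, 5, 2, 4, 1],
]
print(round(cronbach_alpha(pilot), 3))
```

Values closer to 1 suggest the items move together; what counts as acceptable depends on the research domain.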

Main Idea: Using Data Requirements to Design
the Survey Instrument

• Saturation of literature review and clear understandings
• Accepted (in the domain) research design and research methods
– Optimally original but evolved methods
• A valid and reliable survey instrument that has been pilot tested
– Comprehensive data collection (without gaps)
– Fully accessible (particularly if multimedia is used)
– Informed consent, participant opt-out options
– Fully legal (no IP contravention, no privacy contravention, no libel, no slander, no defamation)
– Actual hypothesis testing (seeking confirmation / seeking disconfirmation)
– Controlling for researcher subjectivity in multiple ways:
• using neutral tools, experimental methods, and empirical observations (per quantitative methods)
• owning, surfacing, and adjusting for subjectivity (per qualitative methods)
– Neutral question design (non-leading questions, non-leading examples, non-leading imagery or multimedia)
• Close-ended questions with open-ended elaboration and / or non-applicable opt-outs
• Non-overlap of questions (no multicollinearity)
• Clear language
• Trap questions for respondent honesty tests; pattern tests; speed tests
Let’s Assume
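
The trap, pattern, and speed tests listed above could be applied to captured responses along these lines. This is only a sketch: the field names (`trap_answer`, `trap_expected`, `grid_answers`, `duration_seconds`) and the 60-second threshold are assumptions made for illustration.

```python
def quality_flags(response, min_seconds=60):
    """Return the list of quality checks that a single response fails."""
    flags = []
    # Trap question: the respondent was instructed to pick a specific option.
    if response["trap_answer"] != response["trap_expected"]:
        flags.append("failed_trap")
    # Pattern test: the same answer for every item in a grid ("straight-lining").
    grid = response["grid_answers"]
    if len(grid) > 1 and len(set(grid)) == 1:
        flags.append("straight_lined")
    # Speed test: finished faster than a plausible reading time.
    if response["duration_seconds"] < min_seconds:
        flags.append("too_fast")
    return flags

careful = {"trap_answer": "B", "trap_expected": "B",
           "grid_answers": [4, 2, 5, 3], "duration_seconds": 410}
rushed = {"trap_answer": "A", "trap_expected": "B",
          "grid_answers": [3, 3, 3, 3], "duration_seconds": 35}
print(quality_flags(careful), quality_flags(rushed))
```

Flagged responses are candidates for review rather than automatic rejection.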

• Anonymity / confidentiality / identified respondents (awareness of re-identification possibilities such as through IP capture)
• Full security (vetting respondents, ensuring data integrity, protection against manipulation of data, and so on)
• IRB approval (or survey exemption), oversight, and constructive feedback
• A valid sampling of survey respondents
• Accurate data capture
– Quantitative data (baselines)
– Qualitative data (“color”)
• Proper data handling (including keeping a pristine master of all raw data files)
• Data cleaning
– de-duplication
– de-noising (misspellings, non-English languages)
– data harmonization
– rejection of ‘bot results, if any
– rejection of apparently falsified results
– sometimes elimination of outliers (that pull curves), and others
Let’s Assume (cont.)
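
Two of the cleaning steps above, de-duplication and data harmonization, might look like this in practice. The sketch assumes each response is a dictionary; the field names `response_id` and `state` are hypothetical. The raw rows are copied rather than edited, in keeping with the pristine-master point.

```python
def clean_responses(rows):
    """Return de-duplicated, harmonized copies; the input rows are left untouched."""
    seen = set()
    cleaned = []
    for row in rows:
        rid = row["response_id"]
        if rid in seen:          # de-duplication: keep the first copy only
            continue
        seen.add(rid)
        fixed = dict(row)        # copy so the raw master stays pristine
        # Harmonization: collapse stray whitespace and inconsistent case.
        fixed["state"] = " ".join(row["state"].split()).title()
        cleaned.append(fixed)
    return cleaned

raw = [
    {"response_id": "R_1", "state": "  kansas "},
    {"response_id": "R_1", "state": "  kansas "},
    {"response_id": "R_2", "state": "NEW  york"},
]
print(clean_responses(raw))
```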

• Thorough data analytics
– Analysis of the survey instrument and survey respondent behavior
– Analysis of the survey contents
– Computer-based analytics
– Supplemental crowd-sourced analytics
– Proper assertability (with the proper amount of confidence)
• Statistical significance (ability to reject the null hypothesis)
• Insight and relevance (based on logically sound analysis)
• Insightful discussion
– Qualification of results
• Accurate write-up and presentation
– Ceteris paribus (all things being equal…or remaining the same)
Let’s Assume (cont.)
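
As one illustration of the statistical-significance point, a Pearson chi-square test of independence on a 2x2 table of close-ended answers can be sketched as below. The counts are hypothetical, the sketch assumes no expected count is zero, and the choice of this particular test is an assumption; the presentation does not name one.

```python
def chi_square(table):
    """Pearson chi-square statistic for a table of observed counts (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat

# Critical value for 1 degree of freedom (a 2x2 table) at alpha = 0.05.
CRITICAL_DF1_05 = 3.841

# Hypothetical counts: respondent group (rows) by yes/no answer (columns).
observed = [[30, 10], [20, 40]]
stat = chi_square(observed)
# A statistic above the critical value rejects the null hypothesis of independence.
print(round(stat, 2), stat > CRITICAL_DF1_05)
```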

• What are some problems that may arise in the design of a survey
instrument when the survey designer is not sufficiently aware of data
analytics needs from the beginning?
• Starting from the resulting data…and working backwards to an efficient
survey…
1. Online survey-based data analytics
2. Data
3. Eliciting responses through online surveys
Controlling for a Limited Class of Problems

• A quality ask…will result in quality response data (and the converse is also
true: A low-quality ask…)
• Human time—spent cleaning data, transcribing data, interpreting data,
conducting calculations, manually coding data, computationally coding
data, and setting up queries—is expensive
• Data need to be in a basic machine-usable format for data queries,
autocoding, and data visualizations
– Computational insights stand to benefit human research
– Data contain more informational value than initially conceptualized
Some Basic Principles

DEMOs
Qualtrics® in
Research

• Respondent selection
• Overview of research
• Contact information for
researcher / research team
• Informed consent
• Content questions
• Demographic questions
• Conclusion
General Online Survey Instrument Sequence

• Constant Sum Questions
– Estimates usually summing to 100%
• Pick, Group and Rank Questions
• Hot Spot Questions
– Identification of locations on a 2D
image
• Heat Map Questions
– Spatial frequency visualization
• Graphic Slider Questions
• Text/Graphic Questions
• Multiple-Choice Questions
• Matrix Table Questions
• Text Entry Questions
– Also password-entry
• Slider Questions
– With visual interface
• Rank Order Questions
• Side by Side Questions
Question Types in Qualtrics®

• Hidden Questions (timing, meta
info: browser type, browser
version, operating system, screen
resolution, Flash version, Java
support, user agent)
• Captcha (Completely Automated
Public Turing test to tell Computers
and Humans Apart)
• File Upload Questions, and others
• Gap Analysis Questions
• Drill Down Questions
• Loop & Merge Questions
– Extracting responses from one prior
question to review and explore in
more depth in a subsequent
question
• Net Promoter Score Questions
(“customer loyalty”)
Question Types in Qualtrics® (cont.)
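
For the Net Promoter Score question type, the score is conventionally the percentage of promoters (ratings of 9 or 10) minus the percentage of detractors (ratings of 0 through 6). A minimal sketch, with a hypothetical function name and sample ratings:

```python
def net_promoter_score(ratings):
    """ratings: 0-10 answers to a 'how likely are you to recommend' question."""
    if not ratings:
        raise ValueError("no ratings")
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    # Passives (7 or 8) count toward the total but toward neither group.
    return 100 * (promoters - detractors) / len(ratings)

print(net_promoter_score([10, 10, 9, 3]))
```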

• Google Translate integration
• Security features
• Scoring (for uses in trainings, for
example)
• Offline archival of a Qualtrics®
survey (.qsf) and related data
(.csv) in a re-creatable way
• Branching logic
• Piped text (and customizations)
• Email and panel triggers
• Quotas and quota triggers
• Display logic
• Skip logic
• Loop & Merge
• Carry forward
• Default answers
Rich Features

• Reporting Tab -> Survey Statistics
Survey Statistics Data from Qualtrics®

• View Results -> View Reports
• View Results -> Responses (record-by-record)
• View Results -> Cross-Tabulation
Data Analytics on Qualtrics®

• View Results -> View Reports -> Downloads (by Data Table; en masse)
Downloading Data from Qualtrics®
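
Once a data table is downloaded as .csv, it can be read into records for analysis outside Qualtrics®. The sketch below assumes the file has a row of column names followed by one descriptive row (such as question text) before the responses; that layout, the column names, and the sample rows are assumptions, so check them against the actual download.

```python
import csv
import io

def load_export(text, extra_header_rows=1):
    """Parse CSV text whose first row holds column names, followed by
    `extra_header_rows` descriptive rows (an assumption) before the responses."""
    rows = list(csv.reader(io.StringIO(text)))
    names, records = rows[0], rows[1 + extra_header_rows:]
    return [dict(zip(names, record)) for record in records]

# Hypothetical export contents, inlined so the example runs without a file.
sample = (
    "ResponseID,Q1\n"
    "ResponseID,How satisfied are you?\n"
    "R_1,Satisfied\n"
    "R_2,Neutral\n"
)
print(load_export(sample))
```

For a real file, the text would come from `open(path, newline="", encoding="utf-8").read()`.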

• A report can be shared at a public URL (“Make Report Public”)
• Interactive data dashboards can be shared
Data Dashboards in Qualtrics®

1. Online survey-based data analytics
2. Data
3. Eliciting responses through online surveys
Three Main
Takeaways

• A survey instrument and / or technology may be used at any point in a research
process and for a variety of purposes
• Captured data include data about the survey instrument, about the way the
survey was conducted, about related Web technologies, and about the
target data (the ostensible aim of the research)
• Data analytics include the following:
– manual coding (by theory, by framework, by models, by emergent / grounded
theory coding, and others),
– computerized data queries (word frequency counts, text searches, matrix queries,
cross-tabulation analysis, social network analysis, and others),
– autocoding (sentiment, theme and sub-theme extraction, word networks, and
others),
– data visualizations (geographical mapping, network analysis, cluster analysis,
treemaps, dendrograms, trends, and others),
– supplemental crowd-sourced analytics (as a commercial service)
Takeaway #1: About Online Survey-Based
Data Analytics
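
A word frequency count, the first of the computerized data queries listed above, can be sketched with the standard library. The tokenization rule and the short stopword list are simplifying assumptions; dedicated text-analysis tools handle these far more carefully.

```python
import re
from collections import Counter

STOPWORDS = frozenset({"the", "a", "is", "and", "to", "of"})

def word_frequencies(answers, stopwords=STOPWORDS):
    """Count words across open-ended answers, ignoring case and stopwords."""
    counts = Counter()
    for answer in answers:
        words = re.findall(r"[a-z']+", answer.lower())
        counts.update(w for w in words if w not in stopwords)
    return counts

# Hypothetical open-ended answers.
answers = ["The survey is long", "Long survey, long wait"]
print(word_frequencies(answers).most_common(3))
```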

Data
• Data come in various forms.
– Some are more readily analyzable than others (e.g. closed answers vs. open
answers, text vs. multimedia).
– Analog data would benefit by being digitized (turned into digital format through
photography, scanning, audio capture, video capture, and others)
• Anything scanned into text has to be searchable (machine readable)
– Open answers (in text format) require human and / or machine-coding
• Misspellings have to be corrected
• Non-English responses will have to be accurately translated (if the base language of the
analytics is English)
Takeaway #2: About Data

Data (cont.)
– Multimedia requires transcriptions (machine queries and codes require that the
data be in alphanumeric format)
– Some machine-based queries require that the quantitative data be in data
table format
– All data require data cleaning (de-duplication; de-noising; rejection of ‘bot
results, if any; rejection of apparently falsified results; sometimes elimination of
outliers that pull curves)
Takeaway #2: About Data (cont.)
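
For the outlier step, one common rule of thumb flags values more than 1.5 interquartile ranges beyond the quartiles. The presentation does not prescribe a rule, so the 1.5 multiplier and the function name are assumptions; flagged values deserve a look before any are eliminated.

```python
from statistics import quantiles

def flag_outliers(values, k=1.5):
    """Return the values lying more than k interquartile ranges outside the quartiles."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]

# Hypothetical numeric answers (for example, hours per week).
print(flag_outliers([1, 2, 3, 4, 5, 6, 7, 100]))
```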

Data (cont.)
• Data invariably reveal more than is initially expected in the elicitation and
the sharing
Data Archival
• Pristine master archives of all raw files have to be preserved before any
data cleaning or analytics or anything else is done
• Data will have to be protected
• Data will have to be archived
– Over time, data will still have to be accessible
– Over time, data will still have to be clear (disambiguated)
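
One way to confirm that a pristine master file has not changed is to record a checksum when it is archived and recompute it later. This is a sketch under that assumption; the function name is hypothetical and the presentation does not name a specific method.

```python
import hashlib

def file_sha256(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Storing the digest alongside the archive lets a later comparison show whether the raw file is still byte-for-byte identical.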

Social Media Data
• Some open surveys use open calls via Facebook, Twitter, and other platforms
• Social media data may be tapped to complement survey data
– Social media data may be scraped (Python, R)
– Social media data may be extracted using web browser add-ons (DownThemAll on
Firefox, Chronos Download Manager on Chrome, NCapture of NVivo) or third-party
tools that access the social media platform APIs (NodeXL)
• Social media data come in three types: content, metadata, and trace
– Content data include shared imagery, shared videos, and shared messaging
– Metadata (descriptive data about data) ride with virtually all digital files (EXIF data
on images, properties of text files, and others)
– Trace data (interaction data) ride with virtually all social media content data
extractions

The Art of the Ask…in Online Surveys
• How a question is asked or an answer is elicited affects the quality of the
answer; similarly, how a survey is sequenced affects the respondent
experience and primes the individual for particular responses
• Surveys should be built in a consciously purposive and neutral way
– Questions / stories / examples / imagery should not be leading
• Survey elicitations should include some questions built to elicit responses
that may disprove the hypotheses and sub-hypotheses
• Work hard to avoid self-deception. It is okay to get responses that are confounding,
surprising, critical, and negative—as long as there is informational value in the captured
data.
• What do you want to know but are afraid to ask? Ask.
Takeaway #3: About Eliciting Responses
through Online Surveys

The Art of the Ask…in Online Surveys (cont.)
• Survey elicitations may include test questions that test the honesty of the
respondent; they may include multiple asks to verify sensitive questions
• Survey instruments should be pilot-tested with a sample group of those
similar to those who will participate in the actual research (all the way
through to data analysis, so the researchers know what the data look like)
– After pilot testing, the instrument should be revised as needed
– The researcher should document how the instrument was
• created
• pilot-tested
• revised
• deployed
Takeaway #3: About Eliciting Responses
through Online Surveys (cont.)

The Art of the Ask…in Qualtrics®
• Ensure respondents are of age and have legal standing to participate if the
survey is an open one
• Use page breaks to pace a survey (and use a progress bar to inform
survey respondents)
• Draw attention to particular high-value questions in order to slow down
survey respondents and to focus their attention (in a neutral way)
• Use invisible questions if time spent on a question (which may be a
simulation or video) or the uses of particular technologies (browsers,
devices) may provide further insights
• Likewise, it is possible to know if a respondent came from Facebook or Twitter (or some
other social media platform which was used to elicit responses to an open-link survey)
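
If the referring URL is available for each response, the originating platform can be classified with a small lookup. How the referrer is captured is not covered here; the sketch assumes it arrives as a URL string, and the domain table is an illustrative assumption.

```python
from urllib.parse import urlparse

# Illustrative mapping from domain to platform; extend as needed.
PLATFORM_DOMAINS = {"facebook.com": "Facebook", "twitter.com": "Twitter", "t.co": "Twitter"}

def referral_platform(referrer_url):
    """Map a referrer URL to a platform name, or 'Other' if unrecognized."""
    host = (urlparse(referrer_url).hostname or "").lower()
    for domain, platform in PLATFORM_DOMAINS.items():
        # Match the domain itself or any subdomain, but not look-alike names.
        if host == domain or host.endswith("." + domain):
            return platform
    return "Other"

print(referral_platform("https://www.facebook.com/groups/1"))
```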

The Art of the Ask…in Qualtrics® (cont.)
• Design a survey sequence to be as inclusive and comfortable for all as
possible to avoid dropouts
– Demographic data are usually not collected until the end
• If branching (based on answers / attitudes / identities, and others), make
sure each branched sub-group has equal access to all relevant questions
(or the equivalency of all questions)
– Don’t accidentally create blind spots by treating different subgroups
differently
– When pilot-testing, do a full walk-through of every branch

• Use close-ended questions whenever possible, especially when responses
arrive at mass scale, but design the answer selection to be
comprehensive
– If forcing responses, do not force choices by using close-ended questions which
are insufficiently broad and which do not include an open-ended text option
• Disambiguation of data starts with the survey
– Close-ended responses should be used for anything with pre-definition
(locations, rates, measures, and other data features)
• Leaving a close-ended query open-ended (such as a text entry question type) means
the introduction of noise and ambiguity (and the need to do interpretations and
calculations)

• Elicit open-ended textual responses only when there is a large upside in
terms of insights
– Elicit multimedia responses (photos, video shorts, audio) only when there is a
large upside in terms of insights (multimedia data are costly to code because
of their high dimensionality)
• Contextualized questions (such as matrix table, side by side, gap analysis,
and pick, group and rank questions) will result in data tables that require
disaggregation for analysis; some, like hot spot and heat map questions,
may require data visualizations along with the numerical data
• Visual-based responses may introduce a level of ambiguity which may
need parsing afterwards; data reportage has to be fully disambiguated
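
Disaggregating a matrix table question usually means turning one wide row per respondent into one record per respondent and item. A minimal sketch follows, assuming the sub-items arrive as columns named with a shared prefix such as `Q5_1`, `Q5_2`; that naming is an assumption for illustration.

```python
def disaggregate(rows, id_field, prefix):
    """Turn columns such as Q5_1, Q5_2 into one record per respondent and item."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column.startswith(prefix + "_"):
                item = column[len(prefix) + 1:]
                long_rows.append({id_field: row[id_field], "item": item, "value": value})
    return long_rows

# Hypothetical wide rows from a two-item matrix table question.
wide = [
    {"response_id": "R_1", "Q5_1": 4, "Q5_2": 2},
    {"response_id": "R_2", "Q5_1": 5, "Q5_2": 3},
]
print(disaggregate(wide, "response_id", "Q5"))
```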

About Asking for Cross-Tabulation Analysis
• Build each question to focus on a particular variable for ease of cross-tab
analysis and matrix queries
– Using multivariate questions will add noise to the data
• To enable the effective application of the cross-tabulation analytics
feature in Qualtrics®, each close-ended question should capture a
particular variable (not multiple variables)
• Demographic data should be captured to enable initial assertions of
respondent classifications, usually including gender, age, race, ethnicity,
education-level, geographical location, earnings, and behavioral
classifications (such as purchases) and attitudinal classifications (such as
beliefs and sentiments)
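
When each question captures a single variable, a cross-tabulation outside Qualtrics® reduces to counting pairs of answers. The sketch below uses hypothetical field names and responses.

```python
from collections import Counter

def cross_tab(rows, row_var, col_var):
    """Count how often each (row_var, col_var) pair of answers occurs."""
    return Counter((r[row_var], r[col_var]) for r in rows)

# Hypothetical responses: one demographic variable, one content variable.
responses = [
    {"education": "Bachelor", "satisfied": "Yes"},
    {"education": "Bachelor", "satisfied": "No"},
    {"education": "Master", "satisfied": "Yes"},
    {"education": "Bachelor", "satisfied": "Yes"},
]
for (education, satisfied), count in sorted(cross_tab(responses, "education", "satisfied").items()):
    print(education, satisfied, count)
```

A question that mixed two variables in one answer could not be split cleanly into these cells, which is the practical cost of multivariate questions.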

Given computational data queries and autocoding…
Conceptualizing
Generic Askable
Questions

• What percentage of a population has a certain experience / a certain
attitude / a certain belief towards a particular topic?
– Is there consensus? Dissensus? How so?
• How does the surveyed group (and subgroups) perceive a particular
issue? What values issues are at play? Personality issues? Policy issues?
– What is the strength of sentiment? What is the direction of sentiment?
– Based on sampling over time, how is an issue trending?
– How do group perceptions align with facts-on-the-ground?
• How well informed is the surveyed group (and subgroups) about particular
issues? How can one tell? (Where is their information apparently coming
from?)
Generic Askable Questions from Quant Data
from Surveys

• What are preferences for certain courses of action among survey
respondents? Certain aesthetics? Certain messaging?
– What are the various ratings for particular options? (such as on a Likert scale)
• What are areas of interest in a 2D image or map?
• What is perceived or experienced from a particular video, simulation, or
game?
Generic Askable Questions from Quant Data
from Surveys (cont.)
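
Ratings on a Likert scale can be summarized with a count per scale point and a median. The five-point scale and the sample ratings here are assumptions for illustration.

```python
from collections import Counter
from statistics import median

def likert_summary(ratings, scale=(1, 2, 3, 4, 5)):
    """Summarize ordinal ratings: sample size, median, and count per scale point."""
    counts = Counter(ratings)
    return {
        "n": len(ratings),
        "median": median(ratings),
        "distribution": {point: counts.get(point, 0) for point in scale},
    }

print(likert_summary([5, 4, 4, 2, 1]))
```

The median is used rather than the mean because Likert points are ordinal categories.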

• What are some unique ways that respondents express themselves in
response to a question? (used for “color” to illuminate numbers, as
“exemplars” and “anomalies” in a research analysis and write-up)
• What is the text frequency of a particular concept or phrase or hashtag?
• What is the gist of a particular textual phrase, in terms of how it is used (in
every context)?
• What are word relationships found in a text corpus or set? What insights do
these word relationships show?
– What are the identified types of interrelationships between entities in a text
network?
• What are the observed emotions in a text set?
Generic Askable Questions from Qual Data
from Surveys

• What are the observed expressed sentiments in a text set? (positive-
negative)
• What are extracted themes and sub-themes from a text set?
• What other insights are available from computational “distant reading”
methods?
• In an imaginary scenario, how do respondents think they would act?
• What do survey respondents observe in provided videos?
– Given a particular visual or audio or audiovisual prompt, what do the
respondents see?
Generic Askable Questions from Qual Data
from Surveys (cont.)

• In a particular (image, audio, video, multimedia) set, what are some
common shared messages? Features? Concepts? Depictions? Tones?
Moods?
• Are there common categories or topical clusters? If so, how is the set
divided up in terms of categories?
• In a multimedia set, what various types of unique media formats are
there? How may the respective sub-groups of multimedia be described?
• Who are the main authors of multimedia contents? What is their apparent
purpose in sharing the multimedia?
• Who are the apparent audience members for multimedia? How is
messaging targeted to particular audiences?
Generic Askable Questions from Multimedia
Data

• What is the stylization of the (multimedia) data set?
• What technologies are used in the development of the multimedia data
set (by the original authors)?
• Are there geographical regionalisms in the multimedia set? Common
cultures? Common languages?
Generic Askable Questions from Multimedia
Data (cont.)

• Which social media user accounts are the most active in a group? In a
#hashtag network? In a keyword search?
• Who is behind a social media user account in terms of “profiling” based on
messaging and shared contents?
• Who are the members of a clique in a particular topical network? A
particular geographical network? A formal group? A hashtag (#)
network? Other types of networks?
– What are the assumed social dynamics in the clique (based on structure
analysis)?
• What messaging is being shared among a particular group on social
media?
Generic Askable Questions from Social Media
Data
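
The question of which accounts are most active can be approximated from an edge list of interactions by counting how often each account appears as the sender. The account names and the (sender, receiver) layout are hypothetical; tools such as NodeXL compute richer network measures.

```python
from collections import Counter

def most_active(edges, top=3):
    """edges: (sender, receiver) pairs, e.g. who replied to or mentioned whom."""
    sent = Counter(sender for sender, _ in edges)
    return sent.most_common(top)

# Hypothetical interaction edges extracted from a hashtag network.
edges = [("ana", "ben"), ("ana", "cy"), ("ben", "ana"), ("ana", "dee"), ("cy", "ana")]
print(most_active(edges, top=2))
```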

• What are some common themes in a Tweetstream? #hashtag
conversation? Crowd-sourced article?
• What are some common sentiments (positive / negative) expressed in a
Tweetstream? #hashtag conversation? Crowd-sourced article?
• What are the interrelationships in an article network on Wikipedia? What
do these connections show about the original topic?
• What sorts of informal or “folk” tags are used to label social media
contents (like imagery and videos)?
– What sorts of related tags networks may be extracted?
– What sorts of image sets are extracted based on particular #hashtags?
Keywords? Tags?
– What sorts of video sets are extracted based on particular #hashtags?
Keywords? Tags?
Generic Askable Questions from Social Media
Data (cont.)

• What are some of the issues-based discourse trajectories within a
particular social group or cluster? How does this discussion evolve over
time? What are critical moments in such discussions? What are the
conclusions regarding the issues (based on the respective subgroups)?
Generic Askable Questions from Social Media
Data (cont.)

• Dr. Shalin Hai-Jew
• iTAC, Kansas State University
• 212 Hale Library
• 785-532-5262
• shalin@k-state.edu
• The presenter has no formal tie to Qualtrics, Inc.
Conclusion and Contact
