1. ©Unpickle, 2019. All Rights Reserved - Privileged and Confidential www.unpickle.in
How to get started in Text Analytics in MR
Milind Kelkar
March 05, 2019
<connect@unpickle.in>
2. Text Data: What is it?
• Interviewer or respondent data recorded and transcribed.
• Text features extracted from ‘Show Card’, ads and collage image files.
• Text data in web, transcripts, open ended responses, and secondary research reports.
3. Sources of Text Data in Market Research
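[Figure: sources of text data arranged by structure (Structured, Semi Structured, Unstructured), by origin (Internal, External), and by timing (Real Time, Static)]
Sources shown in the figure:
• Audio
• Transcript
• Retail Pack Image
• PDF
• JPEG
• Interview Notes in Word
• Web Pages
• Historical presentation in PDF
• Log Files
• Delimited Text File
• Excel Inconsistent – Output files
• Excel Consistent – Survey data
• Databases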
4. Sources of Text Data in Market Research
• Feedback Analyses
– Customer Satisfaction
– Balanced Scorecard Analysis
• Testing
– Concept and Product Studies
– Advertising Responses
• Qualitative
– Focus Group and In-depth interviews
– Secondary Research
– Anthropological Studies
• Monitoring
– Social Media Analyses
5. Text Analytics and AI are enabling automation
1 Open Source Technologies: Increased access to lower cost capabilities, enabling creation of customised and secure platforms.
2 Death of Distance: Centralise, update, and collaborate with team members and customers.
3 Text Analytics & Machine Learning: Computerized “mind reading” tools discover hidden patterns and summarize voluminous text responses.
4 Do-It-Yourself (DIY): Self-service capability saves time, provides flexibility and improves accuracy.
6. Automaton mimics Intelligent Behavior
• Respondent
• Researcher
• Analyst
• Coders
• Comprehension
• Judgment
• Reasoning
• Knowledge
Modified from CSUDH Computer Science Department
Automation
7. Text Analytics is a core component of AI
[Figure: AI components]
Components shown in the figure:
• Expert System
• Images
• Audio
• Video
• OCR
• Speech2Text
• Text Pre-Processing
• Natural Language Understanding, Processing, and Generation
• Machine Learning
• Automated Behavior
• Tools
• Dictionary
8. Taming the Text Data before it is Analysed
Text Preprocessing
• Cleanup
• Parts of Speech
• Stop words
Text Transformation
• Bag of Words
• TF-IDF
• Vector Space
Feature Selection
• Dimension
Data Mining
• Named Entity
• Summarization
• Sentiment
Ref: Not known
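The Text Transformation step above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the deck's own implementation: the sample responses and function names are hypothetical, and the weighting shown (term frequency times log(N / document frequency)) is one common TF-IDF variant among several. Libraries often use smoothed versions.

```python
import math
from collections import Counter

# Hypothetical open-ended survey responses (illustrative data, not from the deck).
responses = [
    "the car is fast",
    "the car is cheap",
    "service at the dealer is slow",
]

# Stop word list taken from the cleanup example in this deck.
STOP_WORDS = {"is", "are", "on", "the", "at", "who", "when"}


def tokenize(text):
    """Lowercase, split on whitespace, and drop stop words."""
    return [w for w in text.lower().split() if w not in STOP_WORDS]


def bag_of_words(tokens):
    """Bag of words: raw term counts for one document, ignoring word order."""
    return Counter(tokens)


def tf_idf(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    bags = [bag_of_words(tokenize(d)) for d in docs]
    n_docs = len(bags)
    # Document frequency: the number of documents in which each term appears.
    df = Counter(term for bag in bags for term in bag)
    weights = []
    for bag in bags:
        total = sum(bag.values())
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in bag.items()
        })
    return weights


weights = tf_idf(responses)
```

In this toy data, "car" appears in two of the three responses and so gets a lower weight than "fast", which appears in only one. That is the point of TF-IDF: terms that distinguish a response count for more than terms shared by most responses.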
9. Text Transformation Example
Cleanup | Example
Spelling mistakes | User ~ suer
Stemming | car, cars, car's, cars’ -> car
Stop words | “is”, “are”, “on”, “the”, “at”, “who”, “when”
Capital letters | Field vs. field
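The four cleanup steps in this example can be sketched as follows. This is a rough illustration under stated assumptions, using only the Python standard library: the vocabulary list is hypothetical, the stemmer is a naive suffix stripper standing in for a real one (it would mangle words such as "this"), and spelling correction uses `difflib.get_close_matches` with an arbitrarily chosen cutoff.

```python
import difflib

# Stop word list taken from the example above.
STOP_WORDS = {"is", "are", "on", "the", "at", "who", "when"}

# Hypothetical vocabulary of known words, used only for spelling correction.
VOCABULARY = ["user", "car", "field", "fast"]


def stem(word):
    """Naive suffix stripping; a stand-in for a real stemmer."""
    word = word.replace("’", "'")  # normalise curly apostrophes
    for suffix in ("'s", "s'", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 1:
            return word[: -len(suffix)]
    return word


def fix_spelling(word):
    """Replace a word with its closest vocabulary entry, if one is close enough."""
    matches = difflib.get_close_matches(word, VOCABULARY, n=1, cutoff=0.7)
    return matches[0] if matches else word


def clean(text):
    """Lowercase, strip punctuation, stem, drop stop words, fix spelling."""
    tokens = []
    for raw in text.split():
        word = raw.lower().strip('.,;:!?"“”')  # capital letters and punctuation
        word = stem(word)
        if not word or word in STOP_WORDS:
            continue
        tokens.append(fix_spelling(word))
    return tokens


print(clean("The suer is on the Field when the cars’ are fast."))
# → ['user', 'field', 'car', 'fast']
```

The order of the steps is a design choice: stemming runs before spelling correction so that "cars" is reduced to "car" by the stemmer, not matched against the vocabulary as a misspelling.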
10. Text Analytics Maturity Curve in MR
[Figure: maturity curve with stages Mature, Work-in-Progress, and Evolving, covering Qual, Quant, Digital, and Field]
Labels shown in the figure:
• Image to Text
• Voice to Text
• Sentiment Analysis
• Chatbots
• Intelligent Search
• Verbatim Coding
• Language Translation
• Virtual Assistant
• Generate Sentences
• Data Representation
• Data Management
• Data Reduction
11. Text Analytics is poised to make an incredible impact on Market Researchers
• Quantitative data tells you what respondents did
• Text data can tell you much more - motivations, occasions, accompaniments…
12. How to get started in Text Analytics in MR
• Design End-State
– Explore, Quantifying, Automation
– Input source, Language
– Output/Consumption
• Text Analytics ROI
– Business Use Case
– Effort vs. Benefit trade-off
– Change Management
• Commercial Tools, Cloud API
– Volume vs. User Concurrency
– Static vs. Real-time
• Open Source Tools
– Python, R, StanfordNLP, OpenNLP
– Sand-box environment
• Build Your Own Stack
– Database
– Big Data
– Computational Linguistics
– Programming Language
– Scripting Language
– Machine Learning
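As one possible first step with open source tools, the "Quantifying" goal can be sketched as dictionary-based coding of open-ended responses in Python. This is an illustrative sketch only: the code frame, keywords, and responses below are hypothetical and not taken from the deck, and exact keyword matching is a deliberately simple stand-in for the linguistic and machine learning methods listed above.

```python
from collections import Counter

# Hypothetical code frame: theme -> keywords (illustrative, not from the deck).
CODE_FRAME = {
    "price": {"price", "cheap", "expensive", "cost"},
    "service": {"service", "staff", "support"},
    "quality": {"quality", "durable", "broke"},
}

# Hypothetical open-ended responses.
responses = [
    "Great service but the price is too high",
    "Cheap and durable",
    "The staff were rude",
    "No comment",
]


def code_response(text):
    """Return the set of themes whose keywords appear in the response."""
    words = set(text.lower().split())
    return {theme for theme, keywords in CODE_FRAME.items() if words & keywords}


def quantify(texts):
    """Count how many responses mention each theme."""
    counts = Counter()
    for text in texts:
        counts.update(code_response(text))
    return counts


theme_counts = quantify(responses)
```

Here `theme_counts` holds the number of responses per theme, which turns free text into a table that can be charted alongside quantitative survey data. A response can carry several themes, and a response matching no keyword is simply left uncoded.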
13. Thank You