Introduction to Text Analytics
October 2, 2013
Dr. Stuart Shulman
Phone No.: +1-413-345-8939
E-mail: stu.shulman@visioncritical.com
The Value Proposition
Our solution helps users easily discover information to:
• streamline business processes
• increase ROI & create new business opportunities
• identify positive and negative trends
• discover unique, rare or unexpected information
How Do These Tools Help Analysts?
What This Means for Analysis
The Core Methods
Coding and Classifying Text Data
Iteration and Re-Use Are Critical Techniques
Measure Everything Starting With Human Agreement
The Core DiscoverText Approach
An Indispensable Role for Humans
Innovation Happens in Groups
“CoderRank” – A Lifetime Accuracy Measurement
Vision Critical Patent Pending – “Enhanced Machine Learning”
Five Essential Tools for Text Analytics
1. Search
2. Filtering on Metadata
3. Human Coding
4. Automated Clustering
5. Machine Classification
A Social Media Use Case
Sifting and Sorting Relevant Data
Great Researchers Demand Transparent Tools
The HMC Is a Leading-Edge Gnip Customer
Gnip Data Streams and Search Filters
Fair Warning
This part of the presentation contains strong and potentially quite offensive, inappropriate, disturbing, or just completely stupid language.
Studying Media Campaign Effects
Create Custom Machine Classifiers
Yes
No
No
Search is Fundamental for Purposive Sampling
Defined Search Speeds Up Discovery
Tumblr. – “The Wild West of the Internet”
Stupid Stuff People Do & Tweet
redacted
redacted
Are These Tweets Just Social Garbage?
redacted
redacted
Signs of Health Fear Engagement
redacted
redacted
An IdeaScreen Use Case
Concept Testing Data
Raw VoC Data: A Fortune 500 Tech Company
Near Duplicate Clusters Can Be Interesting
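One way to picture near-duplicate clustering is the minimal sketch below. It is an illustration only, not the DiscoverText method, which the slides do not describe. It assumes word-level 3-grams ("shingles"), Jaccard similarity, and a greedy single pass; the function names, the shingle size, and the 0.5 threshold are all hypothetical choices made for the example.

```python
def shingles(text, k=3):
    """Return the set of k-word shingles in a text (lowercased)."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: |intersection| / |union|."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cluster_near_duplicates(texts, threshold=0.5, k=3):
    """Greedy clustering: each text joins the first cluster whose
    representative (its first member) is similar enough, otherwise
    it starts a new cluster. Returns lists of item indices."""
    clusters = []  # pairs of (representative shingle set, member indices)
    for idx, text in enumerate(texts):
        current = shingles(text, k)
        for representative, members in clusters:
            if jaccard(representative, current) >= threshold:
                members.append(idx)
                break
        else:
            clusters.append((current, [idx]))
    return [members for _, members in clusters]


comments = [
    "please stop the new policy now",
    "please stop the new policy today",
    "I love this product",
]
groups = cluster_near_duplicates(comments)
```

Here the first two comments share three of five distinct shingles (similarity 0.6), so they land in one cluster while the third stands alone. Form letters and copy-paste campaigns tend to surface this way.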
Two Naturally Occurring Clusters of Free Text
Wherever Humans Go in Numbers, There Are Clusters
1st Wave of Human Coding Blazes a Trail
A ‘Simple’ Coding Scheme with No Coder Training
Filtering Based on Classifier Scores
Testing Coder Agreement on a Small Sample
Measuring Inter-Coder Agreement
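A minimal sketch of one common agreement statistic, Cohen's kappa for two coders, is shown below. The slides do not say which measure DiscoverText reports, so the choice of kappa, the function name, and the sample codes are assumptions made for illustration. Kappa compares the observed share of matching codes with the share expected by chance from each coder's label frequencies.

```python
from collections import Counter


def cohen_kappa(codes_a, codes_b):
    """Cohen's kappa for two coders who labeled the same items in order."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both coders must label the same non-empty item list")
    n = len(codes_a)
    # Observed agreement: share of items given the same code by both coders.
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    # Chance agreement: product of each coder's marginal label frequencies.
    freq_a = Counter(codes_a)
    freq_b = Counter(codes_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        # Both coders used a single identical label for every item.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical codes from two coders on the same four items.
coder_a = ["yes", "yes", "no", "no"]
coder_b = ["yes", "no", "no", "no"]
kappa = cohen_kappa(coder_a, coder_b)  # observed 0.75, expected 0.5, kappa 0.5
```

Raw percent agreement (0.75 here) overstates reliability when one code dominates; correcting for chance is why a small test sample can be coded first and the scheme revised before scaling up.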
Validation of Coders & Codes
Text Analytics Is a Series of Buckets & Datasets
Breaking Down Concerns by Subtype
Breaking Down Advocacy by Pro and Con
A New Vision Critical Front End
The First Preview of the New Release
The New VC Front End for DiscoverText
Coding Items to Train a Classifier
Leverage Item Metadata While Coding or Filtering
Code Items in a List View
