Market Research Meets Big Data Analytics for Business Transformation

Al Nevarez
Senior Manager, Business Analytics
LinkedIn
Sally Sadosky
Group Manager, Market Research
LinkedIn
The Market Research Conference
Orlando, FL
Nov 2-4, 2015

Agenda
1. LinkedIn’s Business
2. Market Research & Customer Feedback at LinkedIn
3. Market Research Big Data
4. Big Data: talent, tools & process at LinkedIn for MR
5. Low cost per answer with modern ETL (Extract, Transform, Load)
6. The value is in the JOIN
7. Reporting
8. Analysis: Traditional & Modern techniques
9. The Big Picture

Vision
Create economic opportunity for every member of the global workforce

THE ECONOMIC GRAPH
Schools, Companies, Knowledge, Skills, Members, Jobs

Value Proposition: Connect to Opportunity
B2C
Business to Consumer
B2B
Business to Business
Market Research & Analytics are key
to bridge the gap

For our members
Connect: With your professional world
Stay Informed: Through professional news and knowledge
Get Hired: And build your career

For our clients
Hire: Power the majority of the world’s hires
Market: Identify & engage professionals with relevant content
Sell: Social selling. Transform cold calls into warm prospects
@Work: Share content, find, contact, and learn more about people at your company

At LinkedIn, we believe in products that:
1. Deliver on a singular value proposition in a world-class way
2. Are simple and intuitive, and anticipate needs
3. Exceed expectations
4. Emotionally resonate
5. Change the user’s life for the better

Research and Analytics
Member Empathy
Opportunity Identification and Exploration
Idea Generation
Concept Definition
Product Definition
User Experience and Usability
Go To Market
Product Launch
Post Launch Tracking and Evaluation

NPS as a Measure of Loyalty
Member Empathy
Opportunity Identification and Exploration
Idea Generation
Concept Definition
Product Definition
User Experience and Usability
Go To Market
Product Launch
Post Launch Tracking and Evaluation

NPS question: How likely are you to recommend LinkedIn to a friend or a colleague?

                      Known to LinkedIn    Unknown to LinkedIn
Known to Members      Open                 Hidden
Unknown to Members    Discovery            Unknown
Area of Focus: known to self, unknown to others

NPS captures both Heart and Mind

• 2000 completes per month per country
• Daily email sends
• Representative sample: # of visits per 90 days
• Members are kept anonymous
• Mobile ready
• In local language
• Results weighted by country
LinkedIn’s NPS and CSAT program
19
Top 9
Countries
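A minimal sketch of how a country-weighted NPS could be computed. It uses the standard NPS definition (ratings of 9-10 are promoters, 0-6 are detractors). The responses and country weights are made-up values for illustration, not LinkedIn's actual data or weighting scheme.

```python
from collections import defaultdict


def nps(scores):
    """Net Promoter Score: % promoters (9-10) minus % detractors (0-6)."""
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return 100.0 * (promoters - detractors) / len(scores)


def weighted_nps(responses, country_weights):
    """Combine per-country NPS values using hypothetical country weights."""
    by_country = defaultdict(list)
    for country, score in responses:
        by_country[country].append(score)
    # Normalize by the weights of the countries that actually have responses.
    total_weight = sum(country_weights[c] for c in by_country)
    weighted_sum = sum(country_weights[c] * nps(s) for c, s in by_country.items())
    return weighted_sum / total_weight


# Hypothetical (country, rating) pairs and weights.
responses = [("US", 10), ("US", 9), ("US", 6), ("US", 8), ("IN", 10), ("IN", 3)]
weights = {"US": 0.75, "IN": 0.25}

print("Unweighted NPS:", round(nps([s for _, s in responses]), 1))
print("Weighted NPS:", round(weighted_nps(responses, weights), 2))
```

Weighting matters when the number of completes per country does not match each country's share of the member base, since the unweighted figure would over-represent heavily sampled countries.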

Questionnaire Design
• Set a competitive context
• social networking, jobs sites, content
• NPS for each selected site
• Open-end about NPS rating
• CSAT product questions for LinkedIn
• Emotional driver questions for LinkedIn
• Open-end on what LinkedIn can do better
• Key demographics
• Re-contact permission ask
• Behavioral data appends (pre-pop)

Research Analysis Teams at LinkedIn
1. Market research analysts
2. Business analytics data scientists (Al)

Talent
Solutions
Marketing
Solutions
100 team members support 9000+ employees
Sales
Solutions
Premium
Subscriptions
Consumer
Marketing
Business Analytics
Business Operations & Analytics
CFO
CEO
Where is Business Analytics in LinkedIn’s organization?
Market
Research

Insights
What is the best
that could happen?
Intelligence
What will happen?
Information/Knowledge
Why did it happen?
Data
What happened?
Business ROI
Business analytics evolution: from data to transformation
Transformation & Change
Implement & monitor

Business models
Marketing, Sales, Recruiting
Targeting & Attribution
Customer experience
Communication/interpersonal skills
Statistics
Probability
Optimization
Modeling
Numerical analysis
Simulations
Analytics
A-B Testing
SQL, ETL, APIs,
relational database,
graph database,
software engineering,
tool building, web
applications, R, Python,
Data visualization,
data mining,
Machine Learning
Hadoop,
Spark, Hive, Pig
The business analytics staff - Complete Data Scientists
Business
Knowledge
Outcome = Data products
which many staff can leverage

Big Data Technical Themes
1. Efficient: Move the computation to the data
2. Shared foundation to build on with open source
3. Scalability (storage 1/10th the price of traditional)
4. Scalability (grow to multiple, even thousands, of processors with little cost)
5. Reliability (replicated data, failure survival)
6. Schema on read (save all data in raw form, NoSQL)

Components of Hadoop
3 areas
1. Data storage (HDFS): a network OS for the data, with replication
2. MapReduce: efficiently spreads the work
3. Hadoop libraries: Hive, HBase, Pig...

Big Data Query & Analysis Tools
Hadoop

Big Data Tools We Use Regularly at LinkedIn
Hadoop: low-cost storage, unstructured data, highly scalable processing
Hive: SQL-like query, query Hadoop data, massive result sets
Pig: advanced processing, advanced ETL, data flows

MapReduce
Example: average a billion numbers
1. Distribute the numbers to 1,000 nodes
2. Get the sum and the count at each node
3. Sum the sums and sum the counts
4. At the end, average = sum of sums / total count
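The averaging example can be sketched on a single machine, with list slices standing in for cluster nodes. The function names and the small data set are illustrative assumptions. A real job would run on Hadoop rather than in one Python process.

```python
from functools import reduce


def map_chunk(chunk):
    """Map step: each 'node' returns (sum, count) for its own slice."""
    return (sum(chunk), len(chunk))


def combine(a, b):
    """Reduce step: add the sums together and the counts together."""
    return (a[0] + b[0], a[1] + b[1])


def distributed_mean(numbers, n_nodes):
    # Split the data into n_nodes slices, one per simulated node.
    chunks = [numbers[i::n_nodes] for i in range(n_nodes)]
    partials = [map_chunk(c) for c in chunks if c]
    total, count = reduce(combine, partials)
    return total / count


data = list(range(1, 1001))  # stand-in for "a billion numbers"
print(distributed_mean(data, 10))
```

Each node returns a sum and a count rather than its own average because averaging the per-node averages gives the wrong answer whenever the nodes hold different numbers of values.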

Our NPS survey response ETL Process Overview
Survey vendor (API) > Data extraction > Data transformation > Data visualization

Big Data’s Value for LinkedIn
Low-cost storage + schema-less storage + easy for the Data Warehouse team = lower cost per answer

Sampling from the Data Warehouse

Sampling Data Workflow for Survey Research
Members &
Clients use:
Flagship Desktop
Mobile Apps
Talent solutions
Marketing solutions
Sales solutions
Application
Data storage
(Engineering)
ETL to DWH
(Data Services)
400 million members
• Sign ins
• Profile edits
• Language setting
• Product registrations
• Searches
• Publishing
Profile summaries
Aggregated data
Usage & Engagement
levels (daily visits)
Member segments
Survey history
Survey pre-pop data
Sample for
non-survey studies
Sample for
survey studies
SQL processes
Automated, some manual
Global
Daily, monthly or quarterly
Sampling strategy adjustments
Survey pre-pop data
Snapshot tables
SQL
(Marketing Operations)
Survey vendor
Snapshot

Some member data is anonymously passed (or obfuscated and
passed) to the survey vendor with the invitation list to support:
1. Survey branching
2. Survey quota management
3. Survey language
4. Light reporting on survey vendor’s reporting platform
Pass through or pre-pop
Field count: dozen or so

In addition to pre-pop data passed to the survey vendor,
internally we store “snapshot” values about each survey invitee.
1. Maintains a snapshot of the member’s full profile at the time
of survey
2. Private & internal to LinkedIn
3. Used for internal NPS (general BI) analysis & dashboards
4. Used for data mining & pattern discovery
5. Used by many departments to understand members/clients’
activity at time of survey
6. Slice and dice by anything that comes up
7. Key = member id
Snapshot Profile Data
Field count: Hundreds

ETL Process for Low Cost Per Answer
from your survey results

ETL Process Before Big Data
Survey Vendor Data
Survey program A
Survey program B
Survey program C
Survey program D
Survey program E
Multiple Relational
Database Tables
Survey Table A
Survey Table B
Survey Table C
Survey Table D
Survey Table E
What if Survey B adds 5 questions and drops 3 questions?
$ $ $ $
Schema A
Schema B
Schema C
Schema D
Schema E

ETL Process After Big Data
Survey Vendor
Survey program A
Survey program B
Survey program C
Survey program D
Survey program E
1 Simple relational
database table
… with just the data
we need for analysis and
dashboards
But ALL the data fully
available on Hadoop
for other studies
$
$
Schema
HDFS

Survey document
storage on HDFS
Record 1:
{
  "record" : 8695,
  "uuid" : "zzcxgtz2m0ahuzf2",
  "date" : 1434475680000,
  "start_date" : 1434475020000,
  "customer_id" : "abd123",
  "survey_fields" : {
    "Q1_NPS" : "10",
    "Q6_Driversr1" : "11",
    "Q7_Productsatr1" : "8",
    "wave" : 1,
    "country" : 1,
    "is_mobile" : 1,
    "mobileos" : 3,
    "verbatim1" : "Love Linkedin!",
    "status" : 3
  }
}
Schema
An example survey record (condensed)
Core key values are those that exist for every survey record.
Under “survey_fields” we have the survey-specific fields.
DWH team only stores this.
These survey-specific fields may be very different between survey programs, and may change for a given survey program. The DWH team doesn’t care.
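A rough sketch of the schema-on-read idea behind this record layout: documents are stored raw, and each analysis pulls out only the fields it needs at read time. The first record reuses values from the example above. The second record, the `read_nps_rows` helper, and treating status 3 as the records to keep are assumptions made for illustration.

```python
import json

# Keys expected on every survey record, regardless of survey program.
CORE_KEYS = ("record", "uuid", "date", "start_date", "customer_id")

raw_lines = [
    '{"record": 8695, "uuid": "zzcxgtz2m0ahuzf2", "date": 1434475680000, '
    '"start_date": 1434475020000, "customer_id": "abd123", '
    '"survey_fields": {"Q1_NPS": "10", "wave": 1, "status": 3}}',
    # Hypothetical record from another program with a different question set.
    '{"record": 17, "uuid": "hypothetical0001", "date": 1434475690000, '
    '"start_date": 1434475030000, "customer_id": "xyz789", '
    '"survey_fields": {"Q1_NPS": "7", "Q9_NewQuestion": "4", "status": 3}}',
]


def read_nps_rows(lines):
    """Apply the schema at read time: keep only what this analysis needs."""
    rows = []
    for line in lines:
        doc = json.loads(line)
        missing = [k for k in CORE_KEYS if k not in doc]
        if missing:
            raise ValueError(f"record is missing core keys: {missing}")
        fields = doc.get("survey_fields", {})
        if str(fields.get("status")) != "3":  # assumed filter, as in the Pig script
            continue
        rows.append({
            "uuid": doc["uuid"],
            "customer_id": doc["customer_id"],
            "nps_value": int(fields["Q1_NPS"]),
        })
    return rows


rows = read_nps_rows(raw_lines)
for row in rows:
    print(row)
```

The extra question in the second record needs no change to storage or to this reader, which is the point of keeping program-specific fields under a single nested key.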

Example PIG script to read from HDFS
survey_raw = LOAD '/data/external/survey_vendor/survey_program1/
-- Keep only records whose status field is 3
survey_step1 = FILTER survey_raw BY survey_fields#'status' == '3';
-- Project and cast the fields needed for analysis and dashboards
survey_step2 = FOREACH survey_step1 GENERATE
    (chararray) 'survey_program1' AS survey_program_id,
    (chararray) uuid AS unique_response_id,
    (chararray) id AS member_id,
    (int) survey_fields#'vwave' AS wave_field,
    (int) survey_fields#'Q1_NPS' AS nps_value,
    (chararray) survey_fields#'verbatim1' AS reason,
    (int) survey_fields#'Q6_Drivers1',
    (int) survey_fields#'Q7_Product_csat1',
    (int) survey_fields#'V7_Product_csat2',
    (int) survey_fields#'V7_Product_csat3',
    (int) additionalinfo#'mobileos';
-- Write tab-delimited output
STORE survey_step2 INTO 'survey_nps' USING PigStorage('\t');
Upload
To Teradata

Why is all this important? Because...
The Power is in the SQL JOIN
(and letting others join too)
select NPS_value, behavior1, behavior2
from nps_data a
inner join behavior1_data b
on a.customer_id = b.customer_id
inner join behavior2_data c
on a.customer_id = c.customer_id
NPS Data, Behavior 1 Data, Behavior 2 Data

The JOIN allows us to answer questions in the context of business needs and customer experience
Connect, Stay Informed, Get Hired
• What’s the NPS for each of our member audience segments?
• What’s the NPS of members who received our recent marketing campaign and took action on it?
• What’s the NPS of software engineers who have at least 5 skills, each with more than 10 endorsements on their profile?
• What’s the satisfaction with our new messaging tool for members who had it enabled?
• What’s the NPS by region for members who have purchased our premium subscriptions?
• What’s the CRM record for B2B customers who took our NPS survey?
• Which members scored highly on both our member survey and our Talent solutions survey?
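One way a question like "What's the NPS for each segment?" could be answered with a JOIN, sketched with Python's built-in sqlite3 module rather than the production warehouse. The `segment_data` table, its columns, and all rows are hypothetical. Only `nps_data`, `customer_id`, and `nps_value` echo names used elsewhere in the deck.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE nps_data (customer_id TEXT, nps_value INTEGER);
CREATE TABLE segment_data (customer_id TEXT, segment TEXT);
INSERT INTO nps_data VALUES ('a', 10), ('b', 9), ('c', 3), ('d', 8), ('e', 6);
INSERT INTO segment_data VALUES
  ('a', 'premium'), ('b', 'premium'), ('c', 'premium'),
  ('d', 'free'), ('e', 'free');
""")

# Join survey answers to a behavioral/profile table, then compute NPS per segment:
# % promoters (9-10) minus % detractors (0-6).
query = """
SELECT s.segment,
       COUNT(*) AS responses,
       100.0 * (SUM(CASE WHEN n.nps_value >= 9 THEN 1 ELSE 0 END)
              - SUM(CASE WHEN n.nps_value <= 6 THEN 1 ELSE 0 END)) / COUNT(*) AS nps
FROM nps_data n
INNER JOIN segment_data s ON n.customer_id = s.customer_id
GROUP BY s.segment
ORDER BY s.segment
"""
results = conn.execute(query).fetchall()
for segment, responses, nps in results:
    print(segment, responses, round(nps, 1))
```

Because the survey table carries the member key, any other table keyed the same way can be swapped in for `segment_data` to answer the other questions in the list.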

Our NPS monitoring tool at LinkedIn

Big Data Trends 2014
1. Uploadable, findable, shareable, real-time data
2. Sensor use rising rapidly
3. Processing costs falling rapidly, while cloud rises
4. Beautiful new user interfaces, aided by data-generating consumers, helping make data usable and useful
5. Data mining and analytics tools improving and helping find patterns
6. Early emergence of data- and pattern-driven problem solving

Data Mining or
Machine Learning Outcomes
1. Rank or prioritize a customer or prospect list
2. Replace or move assets or resources
3. Classify or segment
4. Rank drivers of a key metric
5. Categorize text
6. Generate a lift for a key metric
Why not: NPS, Promoters, CSAT ?

Data Mining Techniques Commonly Used by the Business Analytics Team on Market Research & Other Marketing Data
• Decision Trees & Random Forest
• Generalized Boosted Models (GBM)
• Logistic Regression
• Stochastic Gradient Descent (SGD)
• Clustering
• Bayesian Networks
• Text Classification & Mining (LDA, NLP)

Quad Chart: Importance vs. Performance
[Figure: Driver 1 through Driver 5 plotted by Importance (low to high) and Performance (low to high); quadrants labeled Invest & Improve, Monitor, Maintain & Leverage, and Assess needs]
Tools for Provoking & Taking Action
1. Always-available NPS and CSAT Dashboards for anyone,
for any product line
2. Drill down analysis
3. Emotional driver prioritization
4. Product driver prioritization
5. Open ends or verbatims
6. Composition & waterfall analysis for studying changes
7. Deep pattern analysis and focus

The Big Picture on Why Big Data Matters to Market Research
Business
Knowledge
Market
Research

Customers
Product
Market Research

Moore’s Law

We are hiring!
LinkedIn Job Search on:
LinkedIn Business Analytics
Market Research
Transform yourself
Transform the company
Transform the world
Our vision is to create economic opportunity
for every member of the global workforce.
Thank you from
Al Nevarez
Sally Sadosky
