Pivotal: Data Scientists on the Front Line: Examples of Data Science in Action


A close-up examination of how data science is being used today to drive company- and sector-level transformations. The session reviews architecture, business goals, data science methodology and tool usage, and the path to operationalization. Multiple case studies reveal how data science has delivered lasting value to organizations while also paving the way for data to become a new source of competitive differentiation.


After this session you will be able to:
Objective 1: Learn how companies can become predictive rather than reacting to the past.
Objective 2: Understand why companies that employ data science strategies will be able to develop a competitive advantage.
Objective 3: Understand how companies can get started on their journey to become data-driven organizations.

Published in: Technology, Education

Transcript of "Pivotal: Data Scientists on the Front Line: Examples of Data Science in Action"

  1. Pivotal Data Scientists on the Front Line: Examples of Data Science in Action
     Getting to Know Your Customer with Big & Fast Data
     Pivotal Data Science Team
     © Copyright 2013 EMC Corporation. All rights reserved.
  2. Welcome – It’s a Pleasure to Meet You
     • The Launch of Pivotal
     • Pivotal Data Science Team
     • Getting to Know Your Customer
       - Meet your customer: Build Models
       - Learn more about your customer: More Data
       - Adapt to your customer: Dynamic Models
     • Let’s Get Started: Pivotal Data Science Labs
     • Q&A
  3. A NEW PLATFORM FOR A NEW ERA
  4. Pivotal, The New EMC Spin-out
     • Pivotal is building a new platform for a new era
     • This platform enables customers to build a new class of applications
     • That leverage Big and Fast Data
     • All this with the power of cloud independence
     (Diagram labels: Private Cloud, Public Cloud)
  5. Introducing the Pivotal Stack
     • Data-Driven Application Development
     • Pivotal Data Science Labs
     • Cloud Application Platform
     • Data & Analytics Platform
     • Virtualization
     • Cloud Storage
  6. Pivotal Services: Rapid Time to Value
     • Pivotal Labs: Quickly create and deploy new applications
       - Proven methodology to remove risk and accelerate results
     • Pivotal Data Science Labs: A proven data science practice to accelerate analytics projects
       - Drive business value through data analytics
     • Open Source Support: Collaborative and customer-driven open source support, services and co-development
  7. Pivotal Data Science
  8. Tell Me About Data Science
     • What it is:
       – Data preparation
       – Data exploration and visualization
       – Feature creation based on data and domain knowledge
       – Quantitative modeling & model validation
       – Scoring data
     • What it is not:
       – A set of tools
       – Application development
  9. Platform-Driven Data Science Paradigm Shift
     1. Modeling on more data
     2. Rapid ingestion of new data
     3. Re-use of valuable data
     4. Faster model building
     5. Scalable advanced modeling
     6. Faster model refreshing
     7. Faster data scoring
  10. Pivotal Data Science Knowledge Development
  11. Pivotal Data Science Labs
      • Data Science Strategy
      • Point Model Development
      • Multiple Model Development
      • Transformation to “Predictive Enterprise”
  12. Getting to Know Your Customer
      Deeper Insights With Data Science
  13. More Data Science → Deeper Insights
      • Meet Your Customer: Build Models
      • Learn More About Your Customer: More Data
      • Adapt to Your Customer: Dynamic Models
  14. The New Normal: “An Audience of One”
      (Diagram labels: DATA DEVICES, Individuals, Analytic Services, Employers, Advertising, Information Brokers, AD AGENCY, Marketers, INTERNET, Websites, Data Users/Buyers, RETAIL, GOVERNMENT, Data Aggregators, Catalog Co-ops, Media, Media Archives, Credit Bureaus, PHONE/TV, List Brokers, Government, CONTENT, Delivery Services, Banks)
  15. Data-driven Customer Analytics
      (Diagram labels: Data Size: GB, TB, PB; Purchases, Customer Data, Transaction History, Clickstream, Social Media Analysis, Targeting & Retention, Campaign optimization)
      Unified data supporting re-usable predictive models
  16. More Data Science → Deeper Insights
      • Meet Your Customer: Build Models
      • Learn More About Your Customer: More Data
      • Adapt to Your Customer: Dynamic Models
  17. Who Are Our Customers?
      • One way of learning about customers is to divide them into characteristic groups
      • This is called segmentation
      • Let’s take a look at a segmentation exercise Pivotal did with a large medical insurance company…
  18. What Did We Have to Work With?
      • Product Sales
      • Population Served
      • Claims Data
      • Consumer Data
      • Provider Information
  19. So What Did We Do With this Data?
      • Before – Random Clusters
      • After – Cohesive Clusters
  20. What Was the Outcome?
      • New Clinics
      • Neighborhood Clinics
      • Pirate Clinics
      • Established Clinics
  21. Where to Next?
      Segmentation as Foundational Analytics
      • Consumer-Provider Recommendation Engine
      • Lifetime Value Calculation
      • Churn Models
      • Microsegmentation
      • Marketing Mix Model
      • Cross-Sell/UpSell Optimization
  22. Summary: Get to Know Your Customer by Building Data-Driven Models
      Objective:
      • Improve understanding of customer
      Data:
      • Existing EDW sources
      • New big data sources that capture customer demographics, such as the publicly available US Census
      Data Science Methodology:
      • Segmentation via k-means clustering
      Business Impact & Improvement:
      • Dramatically increase familiarity with makeup and behavior of customer base
      • Drive targeted marketing efforts
      • Lay foundation for higher-quality future models
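Slide 22 names k-means clustering as the segmentation method. The following is a minimal sketch of that technique in plain Python, not the code used in the engagement: the customer records, the two features (age and yearly claim count), and the function name `kmeans` are all hypothetical and chosen only for illustration.

```python
import random


def kmeans(points, k, iters=50, seed=0):
    """Lloyd's algorithm on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    labels = None
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid
        # (squared Euclidean distance).
        new_labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


# Hypothetical customers: (age, claims per year). Invented data.
customers = [(25, 1), (27, 2), (24, 1), (61, 9), (65, 8), (63, 10)]
labels, centroids = kmeans(customers, k=2)
print(labels)
print(centroids)
```

In practice the features would usually be standardized first so that no single column dominates the distance, and the number of segments would be chosen by inspecting cluster quality rather than fixed in advance.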
  23. More Data Science → Deeper Insights
      • Meet Your Customer: Build Models
      • Learn More About Your Customer: More Data
      • Adapt to Your Customer: Dynamic Models
  24. Churn Models for Telecom Industry
      Goal
      – Identify customers who are likely to churn and prevent them from leaving.
      Challenges
      – Cost of acquiring new customers is high
      – Acquisition cost is hard to recoup if the customer is not retained long enough
      – Lower barrier to switching subscribers
      – With mobile number portability, barrier to switching even lower
      Good News
      – Cost of retaining existing customers is lower!
  25. Structured Features for Churn Models
      The problem is extensively studied with a rich set of approaches in the literature
      (Feature sources: Device, Texting Stats, Call Stats, Rate Plans, Customer Demographics)
      • These features are great, but the models soon hit a plateau with structured features!
  26. Blending the Unstructured with the Structured
      • What other sources of previously untapped data could we use?
      • Are our customers happy? Where? What segments?
      • What are the common topics in their conversations online?
  27. Sentiment Analysis and Topic Models
      BETTER PREDICT LIKELIHOOD TO CHURN
      (Diagram labels: Unstructured Data (External, Internal); Sentiment Analysis Engine (Classifier); Topic Engine (LDA); Topic Dashboard; Structured Data: EDW)
  28. Topic Clouds from Twitter - An Example
      • Baby shower & Coupons: 13%
      • Convenience: 26%
      • Promotions, deals: 17%
      • Misc: 32%
      • Store experience: 13%
  29. Summary: More Data to Drive Additional Customer Insights
      Objective:
      • Improve accuracy of churn models by blending structured features with unstructured text
      Data:
      • Existing structured features (call data records, device type, rate plans etc.)
      • Call center memos
      Data Science Methodology:
      • Sentiment Analysis and Topic Modeling
      Business Impact & Improvement:
      • Achieved 16% improvement in ROC curve for Churn prediction
      • Topic Models automatically identified common themes in call center memos
      • Laid foundation for Text Analytics
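Slide 29 describes blending structured churn features with signals from call center memos and measuring the effect on the ROC curve. The sketch below illustrates the idea on invented data. It makes several assumptions that are not in the slides: a word-list negativity count stands in for the sentiment classifier, the blend is a simple weighted sum, and every record, word, weight, and function name is hypothetical. The numbers it produces say nothing about the result reported in the talk.

```python
# Hypothetical word list; a real system would use a trained classifier.
NEGATIVE_WORDS = {"cancel", "dropped", "angry", "overcharged", "switching"}


def negativity(memo):
    """Count negative words in a memo (crude stand-in for sentiment analysis)."""
    return sum(word.strip(".,!?") in NEGATIVE_WORDS for word in memo.lower().split())


def auc(scores, labels):
    """Area under the ROC curve: the share of (churner, non-churner) pairs
    in which the churner gets the higher score; ties count as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented records: (score from structured features, call center memo, churned?)
records = [
    (0.30, "customer angry about dropped calls, may cancel", 1),
    (0.60, "asked about upgrade options", 0),
    (0.55, "overcharged again, considering switching", 1),
    (0.20, "routine billing question", 0),
    (0.70, "threatened to cancel", 1),
    (0.40, "happy with new plan", 0),
]

churned = [y for _, _, y in records]
structured = [s for s, _, _ in records]
# Assumed blend: structured score plus a fixed weight per negative word.
blended = [s + 0.2 * negativity(memo) for s, memo, _ in records]

structured_auc = auc(structured, churned)
blended_auc = auc(blended, churned)
print(f"structured only: {structured_auc:.3f}")
print(f"blended:         {blended_auc:.3f}")
```

A production pipeline would feed the text-derived features into the churn model alongside the structured ones and evaluate on held-out data; the fixed weight here only keeps the example short.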
  30. More Data Science → Deeper Insights
      • Meet Your Customer: Build Models
      • Learn More About Your Customer: More Data
      • Adapt to Your Customer: Dynamic Models
  31. State of Data at Telco Company
      Customer Segments
      • Multi-Gadget Families
      • Affluent Matures
      • Thrifty Families
      • High Tech Singles
      • Budget Singles
      • Seniors
      New Data Sources
      • Internet Deep Packet Inspection
      • TV Consumption (Linear)
      • Video On Demand Consumption
  32. Understanding Subscriber Behavior
      • What is the level of engagement with Client’s products (TV, VOD, Internet)?
      • What are the patterns of device usage behavior?
      • What is the level of OTT engagement, by segment, and by bandwidth?
      (Diagram labels: Native Services (Internet, Video On Demand, TV); Internet Devices; OTT Services)
  33. Newly Identified Behavior-Based Segments
      Subscribers
      • Moderates
      • OTT & Data Heavyweights
      • Portable OTT Entertainment Seekers
        - iPhone Heavy
        - Android Heavy
        - iPad Heavy
      • In-Home OTT Entertainment Seekers
      • In-Home Native Content Seekers
        - VOD Heavy
        - TV Heavy
  34. Going Further: Crossing Behavior-Based Segments on Existing Customer Segments
      Existing Segments
      • Multi-Gadget Families
      • Affluent Matures
      • Thrifty Families
      • High Tech Singles
      • Budget Singles
      • Seniors
      Newly Discovered Usage-Based Segments
      • Moderates
      • OTT & Data Heavyweights
      • In-Home OTT Entertainment Seekers
      • Portable OTT Entertainment Seekers - iPhone Heavy
      • Portable OTT Entertainment Seekers - Android Heavy
      • Portable OTT Entertainment Seekers - iPad Heavy
      • In-Home Native Content Seekers - VOD Heavy
      • In-Home Native Content Seekers - TV Heavy
      Customized Micro-Segments!
  35. Driving New Business Value by Leveraging Data Science
      • Upsell and Cross-Sell
      • New Product Offerings
      • Data Monetization
  36. Summary: Adapt to Your Customer with More Data Science
      Objective:
      • Combine existing models with new models derived from big data sources
      Data:
      • Existing EDW sources
      • New big data sources that capture subscriber behavior, including machine-generated sources such as DPI & VOD set-top box data
      Data Science Methodology:
      • Micro-segmentation via clustering
      Business Impact & Improvement:
      • Reduce operational and financial dependence on survey data
      • Lay foundation for data monetization
      • Generate tailored upsell & cross-sell opportunities
      • Real, customer-behavior-driven guidance for product & app development
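Slide 34 crosses the existing customer segments with the newly discovered usage-based segments to obtain micro-segments. One simple way to picture that crossing is a cross-tabulation, sketched below. The subscriber records are invented, and the assumption that each subscriber carries exactly one label of each kind is made only for illustration; the segment names are taken from the slides.

```python
from collections import Counter

# Hypothetical subscriber records: (subscriber_id, existing_segment, behavior_segment).
subscribers = [
    ("s1", "Seniors", "In-Home Native Content Seekers - TV Heavy"),
    ("s2", "Seniors", "In-Home Native Content Seekers - TV Heavy"),
    ("s3", "High Tech Singles", "OTT & Data Heavyweights"),
    ("s4", "High Tech Singles", "Portable OTT Entertainment Seekers - iPhone Heavy"),
    ("s5", "Multi-Gadget Families", "OTT & Data Heavyweights"),
    ("s6", "Seniors", "Moderates"),
]

# Each (existing, behavior) pair is one candidate micro-segment;
# the counter records how many subscribers fall into it.
micro_segments = Counter(
    (existing, behavior) for _, existing, behavior in subscribers
)

for (existing, behavior), count in sorted(micro_segments.items()):
    print(f"{existing} x {behavior}: {count}")
```

With six existing segments and eight usage-based segments there are up to 48 cells; sparsely populated cells would likely be merged or ignored before any targeting decisions are made.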
  37. Let’s Get Started
      Transforming Your Business with Data Science
  38. Process in New World Order
  39. Pivotal Data Science Labs: Packaged Services
      • LAB PRIMER (2-Week Roadmapping)
        - Analytics Roadmap
        - Prioritized Opportunities
        - Architectural Recommendations
      • LAB 100 (Analytics Bundle)
        - On-site MPP Analytics Training
        - Analytics tool-kit
        - Quick insight (2 weeks)
      • LAB 600 (6-Week Lab)
        - Prof. services
        - Data science model building
        - Ready-to-deploy model(s)
      • LAB 1200 (12-Week Lab)
        - Prof. services
        - Data science model building
        - Ready-to-deploy model(s)
      *Pivotal platform priced separately
  40. Thank You
      Do you have any questions?
  41. Pivotal Sessions at EMC World
      • The Pivotal Platform: A Purpose-Built Platform for Big-Data-Driven Applications
        Presenter: Josh Klahr
        Dates/Times: Tue 5:30 - 6:30, Palazzo E; Wed 11:30 - 12:30, Delfino 4005
      • Pivotal: Data Scientists on the Front Line: Examples of Data Science in Action
        Presenter: Noelle Sio
        Dates/Times: Tue 10:00 - 11:00, Lando 4205; Thu 8:30 - 9:30, Palazzo F
      • Pivotal: Operationalizing 1000-node Hadoop Cluster – Analytics Workbench
        Presenters: Clinton Ooi, Bhavin Modi
        Dates/Times: Tue 11:30 - 12:30, Palazzo L; Thu 10:00 - 11:00 am, Delfino 4001A
      • Pivotal: for Powerful Processing of Unstructured Data For Valuable Insights
        Presenter: SK Krishnamurthy
        Dates/Times: Mon 4:00 - 5:00, Lando 4201 A; Tue 4:00 - 5:00, Palazzo M
      • Pivotal: Big & Fast data – merging real-time data and deep analytics
        Presenter: Michael Crutcher
        Dates/Times: Mon 1:00 - 2:00, Lando 4201 A; Wed 10:00 - 11:00, Palazzo M
      • Pivotal: Virtualize Big Data to Make The Elephant Dance
        Presenters: June Yang, Dan Baskette
        Dates/Times: Mon 11:30 - 12:30, Marcello 4401A; Wed 4:00 - 5:00, Palazzo E
      • Hadoop Design Patterns
        Presenter: Don Miner
        Dates/Times: Mon 2:30 - 3:30, Palazzo F; Wed 8:30 - 9:30, Delfino 4005