Pivotal: Data Scientists on the Front Line: Examples of Data Science in Action
 


A close-up examination of how data science is being used today to drive company and sector-level transformations. Reviewing architecture, business goals, data science methodology and tool-usage, and the path to operationalization. Multiple case-studies reveal how Data Science has delivered lasting value to the organization, while also paving the way for data to become a new source of competitive differentiation.


After this session you will be able to:
Objective 1: Learn how companies can become predictive rather than reacting to the past.
Objective 2: Understand why companies that employ data science strategies will be able to develop a competitive advantage.
Objective 3: Understand how companies can get started on their journey to become data-driven organizations.

    Presentation Transcript

    • Pivotal Data Scientists on the Front Line: Examples of Data Science in Action Getting to Know Your Customer with Big & Fast Data Pivotal Data Science Team © Copyright 2013 EMC Corporation. All rights reserved. 1
    • Welcome – It’s a Pleasure to Meet You
      • The Launch of Pivotal
      • Pivotal Data Science Team
      • Getting to Know Your Customer
        - Meet your customer: Build Models
        - Learn more about your customer: More Data
        - Adapt to your customer: Dynamic Models
      • Let’s Get Started: Pivotal Data Science Labs
      • Q&A
    • A NEW PLATFORM FOR A NEW ERA
    • Pivotal, The New EMC Spin-out
      • Pivotal is building a new platform for a new era
      • This platform enables customers to build a new class of applications
      • That leverage Big and Fast Data
      • All this with the power of cloud independence (Private Cloud, Public Cloud)
    • Introducing the Pivotal Stack: Data-Driven Application Development, Pivotal Data Science Labs, Cloud Application Platform, Data & Analytics Platform, Virtualization, Cloud Storage
    • Pivotal Services: Rapid Time to Value
      • Pivotal Labs: Quickly create and deploy new applications. Proven methodology to remove risk and accelerate results
      • Pivotal Data Science Labs: A proven data science practice to accelerate analytics projects. Drive business value through data analytics
      • Open Source Support: Collaborative and customer-driven open source support, services and co-development
    • Pivotal Data Science
    • Tell Me About Data Science
      • What it is:
        - Data preparation
        - Data exploration and visualization
        - Feature creation based on data and domain knowledge
        - Quantitative modeling & model validation
        - Scoring data
      • What it is not:
        - A set of tools
        - Application development
    • Platform-Driven Data Science Paradigm Shift
      1. Modeling on more data
      2. Rapid ingestion of new data
      3. Re-use of valuable data
      4. Faster model building
      5. Scalable advanced modeling
      6. Faster model refreshing
      7. Faster data scoring
    • Pivotal Data Science Knowledge Development
    • Pivotal Data Science Labs
      • Data Science Strategy
      • Point Model Development
      • Multiple Model Development
      • Transformation to “Predictive Enterprise”
    • Getting to Know Your Customer: Deeper Insights With Data Science
    • More Data Science → Deeper Insights: Meet Your Customer (Build Models), Learn More About Your Customer (More Data), Adapt to Your Customer (Dynamic Models)
    • The New Normal: “An Audience of One” (diagram; labels include Individuals, Data Devices, Internet, Retail, Government, Phone/TV, Content, Data Aggregators, Information Brokers, Credit Bureaus, List Brokers, Catalog Co-ops, Media Archives, Analytic Services, Marketers, Advertising, Data Users/Buyers)
    • Data-driven Customer Analytics. Data size: GB (Purchases, Customer Data), TB (Transaction History, Clickstream), PB (Social Media). Analysis: Targeting & Retention, Campaign optimization. Unified data supporting re-usable predictive models
    • More Data Science → Deeper Insights: Meet Your Customer (Build Models), Learn More About Your Customer (More Data), Adapt to Your Customer (Dynamic Models)
    • Who Are Our Customers?
      • One way of learning about customers is to divide them into characteristic groups
      • This is called segmentation
      • Let’s take a look at a segmentation exercise Pivotal did with a large medical insurance company…
    • What Did We Have to Work With? Product Sales, Population Served, Claims Data, Consumer Data, Provider Information
    • So What Did We Do With this Data? Before: Random Clusters. After: Cohesive Clusters
    • What Was the Outcome? New Clinics, Neighborhood Clinics, Pirate Clinics, Established Clinics
    • Where to Next? Segmentation as Foundational Analytics: Consumer-Provider Recommendation Engine, Lifetime Value Calculation, Churn Models, Microsegmentation, Marketing Mix Model, Cross-Sell/Up-Sell Optimization
    • Summary: Get to Know Your Customer by Building Data-Driven Models
      • Objective:
        - Improve understanding of customer
      • Data:
        - Existing EDW sources
        - New big data sources that capture customer demographics, such as the publicly available US Census
      • Data Science Methodology:
        - Segmentation via k-means clustering
      • Business Impact & Improvement:
        - Dramatically increase familiarity with makeup and behavior of customer base
        - Drive targeted marketing efforts
        - Lay foundation for higher-quality future models
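
The segmentation step named in this summary can be sketched in a few lines. This is a minimal illustration of k-means (Lloyd's algorithm) in plain Python, not the model Pivotal built: the clinic names, the two features, and their values are hypothetical, and the first k points are used as starting centroids only to keep the run reproducible. A real project would standardize the features and use a more careful initialization.

```python
from math import dist


def kmeans(points, k, iterations=50):
    """Lloyd's algorithm. Returns (centroids, labels)."""
    # Assumption: naive, deterministic start from the first k points.
    centroids = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # leave an empty cluster where it was
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, labels


# Hypothetical clinics: (years in operation, monthly claims in hundreds).
clinics = {
    "A": (1.0, 2.0),
    "B": (12.0, 9.0),
    "C": (2.0, 1.5),
    "D": (11.0, 10.0),
    "E": (1.5, 2.5),
    "F": (13.0, 9.5),
}

points = list(clinics.values())
centroids, labels = kmeans(points, k=2)
for name, label in zip(clinics, labels):
    print(name, "-> segment", label)
```

The segment numbers carry no meaning on their own; naming them (for example "new" versus "established" clinics) is a judgment made by inspecting the centroids.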
    • More Data Science → Deeper Insights: Meet Your Customer (Build Models), Learn More About Your Customer (More Data), Adapt to Your Customer (Dynamic Models)
    • Churn Models for Telecom Industry
      • Goal: Identify customers who are likely to churn and prevent them from leaving.
      • Challenges:
        - Cost of acquiring new customers is high
        - Recouping the cost of customer acquisition is hard if the customer is not retained long enough
        - The barrier to switching is low for subscribers
        - With mobile number portability, the barrier to switching is even lower
      • Good News: Cost of retaining existing customers is lower!
    • Structured Features for Churn Models
      • The problem is extensively studied, with a rich set of approaches in the literature
      • Features: Device, Texting Stats, Call Stats, Rate Plans, Customer Demographics
      • These features are great, but the models soon hit a plateau with structured features!
    • Blending the Unstructured with the Structured
      • What other sources of previously untapped data could we use?
      • Are our customers happy? Where? What segments?
      • What are the common topics in their conversations online?
    • Sentiment Analysis and Topic Models: Better Predict Likelihood to Churn (diagram; labels: Unstructured Data (External, Internal), Sentiment Analysis Engine (Classifier), Topic Engine (LDA), Topic Dashboard, Structured Data: EDW)
    • Topic Clouds from Twitter - An Example. Baby shower & Coupons: 13%; Convenience: 26%; Promotions, deals: 17%; Misc: 32%; Store experience: 13%
    • Summary: More Data to Drive Additional Customer Insights
      • Objective:
        - Improve accuracy of churn models by blending structured features with unstructured text
      • Data:
        - Existing structured features (call data records, device type, rate plans etc.)
        - Call center memos
      • Data Science Methodology:
        - Sentiment Analysis and Topic Modeling
      • Business Impact & Improvement:
        - Achieved 16% improvement in ROC curve for Churn prediction
        - Topic Models automatically identified common themes in call center memos
        - Laid foundation for Text Analytics
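
The idea of blending a text-derived signal into a structured churn score, and checking the effect on the ROC curve, can be sketched as follows. Everything here is an assumption made for illustration: a tiny keyword lexicon stands in for the sentiment classifier, topic modeling is left out, and the customer records, memos, scores, and blending weight are invented. The toy data is built to show the mechanics and does not reproduce the improvement reported on the slide.

```python
# Assumed keyword lexicon; the deck used a trained sentiment classifier instead.
NEGATIVE = {"dropped", "cancel", "angry", "overcharged", "slow"}
POSITIVE = {"thanks", "happy", "resolved", "great"}


def sentiment(memo):
    """Lexicon score in [-1, 1]; negative values suggest an unhappy customer."""
    hits = []
    for word in memo.lower().split():
        if word in POSITIVE:
            hits.append(1)
        elif word in NEGATIVE:
            hits.append(-1)
    return sum(hits) / len(hits) if hits else 0.0


def auc(scores, labels):
    """Area under the ROC curve, computed as the share of
    (churner, non-churner) pairs ranked correctly; ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical records: (structured churn score, call center memo, churned?)
customers = [
    (0.8, "calls dropped again want to cancel", 1),
    (0.4, "angry about being overcharged", 1),
    (0.5, "issue resolved thanks", 0),
    (0.3, "happy with new plan", 0),
    (0.6, "great service", 0),
    (0.5, "slow data and dropped calls", 1),
]

SENTIMENT_WEIGHT = 0.5  # assumed blending weight, not taken from the slides

labels = [churned for _, _, churned in customers]
structured = [score for score, _, _ in customers]
# Negative sentiment raises the churn score; positive sentiment lowers it.
blended = [score - SENTIMENT_WEIGHT * sentiment(memo) for score, memo, _ in customers]

print("structured-only AUC:", round(auc(structured, labels), 3))
print("blended AUC:", round(auc(blended, labels), 3))
```

In practice the sentiment and topic outputs would be added as extra features to the churn model and evaluated on held-out data, rather than combined with a fixed weight.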
    • More Data Science → Deeper Insights: Meet Your Customer (Build Models), Learn More About Your Customer (More Data), Adapt to Your Customer (Dynamic Models)
    • State of Data at Telco Company
      • Customer Segments: Multi-Gadget Families, Affluent Matures, Thrifty Families, High Tech Singles, Budget Singles, Seniors
      • New Data Sources: Internet Deep Packet Inspection, TV Consumption (Linear), Video On Demand Consumption
    • Understanding Subscriber Behavior
      • What is the level of engagement with Client’s products (TV, VOD, Internet)?
      • What are the patterns of device usage behavior?
      • What is the level of OTT engagement, by segment, and by bandwidth?
      • Diagram labels: Native Services (Internet, Video On Demand, TV), Internet Devices, OTT Services
    • Newly Identified Behavior-Based Segments
      • Moderates
      • OTT & Data Heavyweights
      • Portable OTT Entertainment Seekers: iPhone Heavy, Android Heavy, iPad Heavy
      • In-Home OTT Entertainment Seekers
      • In-Home Native Content Seekers: VOD Heavy, TV Heavy
    • Going Further: Crossing Behavior-Based Segments on Existing Customer Segments
      • Existing Segments: Multi-Gadget Families, Affluent Matures, Thrifty Families, High Tech Singles, Budget Singles, Seniors
      • Newly Discovered Usage-Based Segments: Moderates; OTT & Data Heavyweights; In-Home OTT Entertainment Seekers; Portable OTT Entertainment Seekers - iPhone Heavy; Portable OTT Entertainment Seekers - Android Heavy; Portable OTT Entertainment Seekers - iPad Heavy; In-Home Native Content Seekers - VOD Heavy; In-Home Native Content Seekers - TV Heavy
      • Result: Customized Micro-Segments!
    • Driving New Business Value by Leveraging Data Science: Upsell and Cross-Sell, New Product Offerings, Data Monetization
    • Summary: Adapt to Your Customer with More Data Science
      • Objective:
        - Combine existing models with new models derived from big data sources
      • Data:
        - Existing EDW sources
        - New big data sources that capture subscriber behavior, including machine generated sources such as DPI & VOD set-top box data
      • Data Science Methodology:
        - Micro-segmentation via clustering
      • Business Impact & Improvement:
        - Reduce operational and financial dependence on survey data
        - Lay foundation for data monetization
        - Generate tailored upsell & cross-sell opportunities
        - Real, customer behavior driven guidance for product & app development
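
Crossing the existing segments with the behavior-based ones amounts to a cross-tabulation: each subscriber carries two labels, and every distinct pair is a candidate micro-segment. The sketch below assumes both labels have already been assigned; the segment names come from the slides, while the subscriber IDs and their assignments are hypothetical.

```python
from collections import Counter

# Hypothetical subscribers: (id, existing segment, behavior-based segment).
subscribers = [
    ("s1", "Thrifty Families", "OTT & Data Heavyweights"),
    ("s2", "Thrifty Families", "In-Home Native Content Seekers - TV Heavy"),
    ("s3", "High Tech Singles", "OTT & Data Heavyweights"),
    ("s4", "Thrifty Families", "OTT & Data Heavyweights"),
    ("s5", "Seniors", "In-Home Native Content Seekers - TV Heavy"),
    ("s6", "High Tech Singles", "Portable OTT Entertainment Seekers - iPhone Heavy"),
]

# Each (existing, behavior) pair is one micro-segment; count its members.
micro_segments = Counter(
    (existing, behavior) for _, existing, behavior in subscribers
)

# Largest micro-segments first.
for (existing, behavior), size in micro_segments.most_common():
    print(f"{existing} x {behavior}: {size}")
```

With six existing and eight behavior-based segments there are up to 48 possible pairs, so in practice sparse pairs would likely be merged or dropped before they are used for targeting.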
    • Let’s Get Started: Transforming Your Business with Data Science
    • Process in New World Order
    • Pivotal Data Science Labs: Packaged Services
      • LAB PRIMER (2-Week Roadmapping): Analytics Roadmap, Prioritized Opportunities, Architectural Recommendations
      • LAB 100 (Analytics Bundle): On-site MPP Analytics Training, Analytics tool-kit, Quick insight (2 weeks)
      • LAB 600 (6-Week Lab): Prof. services, Data science model building, Ready-to-deploy model(s)
      • LAB 1200 (12-Week Lab): Prof. services, Data science model building, Ready-to-deploy model(s)
      • *Pivotal platform priced separately
    • Thank You. Do you have any questions?
    • Pivotal Sessions at EMC World (Session; Presenter; Dates/Times)
      • The Pivotal Platform: A Purpose-Built Platform for Big-Data-Driven Applications; Josh Klahr; Tue 5:30 - 6:30, Palazzo E; Wed 11:30 - 12:30, Delfino 4005
      • Pivotal: Data Scientists on the Front Line: Examples of Data Science in Action; Noelle Sio; Tue 10:00 - 11:00, Lando 4205; Thu 8:30 - 9:30, Palazzo F
      • Pivotal: Operationalizing 1000-node Hadoop Cluster – Analytics Workbench; Clinton Ooi, Bhavin Modi; Tue 11:30 - 12:30, Palazzo L; Thu 10:00 - 11:00 am, Delfino 4001A
      • Pivotal: for Powerful Processing of Unstructured Data For Valuable Insights; SK Krishnamurthy; Mon 4:00 - 5:00, Lando 4201 A; Tue 4:00 - 5:00, Palazzo M
      • Pivotal: Big & Fast data – merging real-time data and deep analytics; Michael Crutcher; Mon 1:00 - 2:00, Lando 4201 A; Wed 10:00 - 11:00, Palazzo M
      • Pivotal: Virtualize Big Data to Make The Elephant Dance; June Yang, Dan Baskette; Mon 11:30 - 12:30, Marcello 4401A; Wed 4:00 - 5:00, Palazzo E
      • Hadoop Design Patterns; Don Miner; Mon 2:30 - 3:30, Palazzo F; Wed 8:30 - 9:30, Delfino 4005