SlideShare a Scribd company logo
What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry
Data Science in Action
Ji Li
Data Scientist
March 25, 2015
Data Science In Action Ji Li
What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry
Overview
Table of Contents
1 What is Data Science
Overview
Data Science Workflow
2 Churn Model
Business Understanding
Demo in R
Top Features
Prescriptive Analysis
3 Yesware Email Analysis
Email Analysis
4 Data Science in the Industry
Data Science Toolbox
Big Data
Data Science In Action Ji Li
What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry
Overview
Data science on Google search
Data Science In Action Ji Li
What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry
Overview
Data science from my point of view
Data Science In Action Ji Li
What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry
Data Science Workflow
Table of Contents
1 What is Data Science
Overview
Data Science Workflow
2 Churn Model
Business Understanding
Demo in R
Top Features
Prescriptive Analysis
3 Yesware Email Analysis
Email Analysis
4 Data Science in the Industry
Data Science Toolbox
Big Data
Data Science In Action Ji Li
What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry
Data Science Workflow
Domain expertise guides the workflow
Data Science Workflow
Data Munging
Data Mining
Delivery of actionable Insights
Data Science In Action Ji Li
What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry
Data Science Workflow
Data Munging
Data Munging
Data Munging means some or all
of the following tasks:
ETL
Data Integration
Data Cleansing
Data Science In Action Ji Li
What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry
Data Science Workflow
Data Munging
Data Munging
Data Munging means some or all
of the following tasks:
ETL
Data Integration
Data Cleansing
ETL
The process of extract,
transform, and load data.
To acquire data from
external sources.
To migrate multiple data
sources internally.
Data Science In Action Ji Li
What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry
Data Science Workflow
Data Munging
Data Munging
Data Munging means some or all
of the following tasks:
ETL
Data Integration
Data Cleansing
Data Integration
To combine data from disparate
sources into meaningful and
valuable information.
Data Science In Action Ji Li
What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry
Data Science Workflow
Data Munging
Data Munging
Data Munging means some or all
of the following tasks:
ETL
Data Integration
Data Cleansing
Data Cleansing
Data cleansing, also called data
scrubbing, is the process of
amending or removing data in a
database that is incorrect,
incomplete, improperly
formatted, or duplicated.
Data Science In Action Ji Li
What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry
Data Science Workflow
Data Mining
Data Mining
Data Mining is the key step to
turn data into insights:
Data Exploration
Machine Learning
Model Evaluation
Data Science In Action Ji Li
What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry
Data Science Workflow
Data Mining
Data Mining
Data Mining is the key step to
turn data into insights:
Data Exploration
Machine Learning
Model Evaluation
Data Exploration
The process of visually examine
and explore the data.
To gain basic understanding
of the data.
To identify relationships
between different attributes.
To answer basic questions
using data.
Data Science In Action Ji Li
What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry
Data Science Workflow
Data Mining
Data Mining
Data Mining is the key step to
turn data into insights:
Data Exploration
Machine Learning
Model Evaluation
Machine Learning
To obtain statistical models, we
usually need to go through
multiple steps like the following:
1 Construct new features.
2 Remove redundant features.
3 Choose one or more suitable
machine learning algorithm.
Data Science In Action Ji Li
What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry
Data Science Workflow
Data Mining
Data Mining
Data Mining is the key step to
turn data into insights:
Data Exploration
Machine Learning
Model Evaluation
Model Evaluation
Model evaluation is often used
not only to select the best model
from the set of models, but also
to get ready for producing
actionable insights.
Data Science In Action Ji Li
What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry
Data Science Workflow
Skills of a data scientist
Data Science skills
Programming
Statistics
Visualization
Data Science In Action Ji Li
What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry
Data Science Workflow
Doing data science
Data Science In Action Ji Li
What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry
Business Understanding
Table of Contents
1 What is Data Science
Overview
Data Science Workflow
2 Churn Model
Business Understanding
Demo in R
Top Features
Prescriptive Analysis
3 Yesware Email Analysis
Email Analysis
4 Data Science in the Industry
Data Science Toolbox
Big Data
Data Science In Action Ji Li
What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry
Business Understanding
The Goal
Goal
To predict which customers will churn.
To find ways to prevent customers from churning.
Workflow
Data Science In Action Ji Li
What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry
Business Understanding
Customer renewal data
Data Science In Action Ji Li
What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry
Business Understanding
Customer service call data
Data Science In Action Ji Li
What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry
Business Understanding
Customer plan type and demographic
Data Science In Action Ji Li
What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry
Business Understanding
Customer usage data
Data Science In Action Ji Li
What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry
Business Understanding
Churn data structure
Data Science In Action Ji Li
What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry
Demo in R
Table of Contents
1 What is Data Science
Overview
Data Science Workflow
2 Churn Model
Business Understanding
Demo in R
Top Features
Prescriptive Analysis
3 Yesware Email Analysis
Email Analysis
4 Data Science in the Industry
Data Science Toolbox
Big Data
Data Science In Action Ji Li
What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry
Demo in R
R Script
http://goo.gl/IV3HDs
Data Science In Action Ji Li
What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry
Top Features
Table of Contents
1 What is Data Science
Overview
Data Science Workflow
2 Churn Model
Business Understanding
Demo in R
Top Features
Prescriptive Analysis
3 Yesware Email Analysis
Email Analysis
4 Data Science in the Industry
Data Science Toolbox
Big Data
Data Science In Action Ji Li
What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry
Top Features
Variable importance from random forest
Top Features
cust surv calls
is intl plan
day mins
Data Science In Action Ji Li
What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry
Top Features
Top feature - customer service calls
Data Science In Action Ji Li
What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry
Top Features
Top feature - customer service calls
Data Science In Action Ji Li
What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry
Top Features
Top feature - customer day time minutes
Data Science In Action Ji Li
What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry
Top Features
Top feature - international plan and international calls
Data Science In Action Ji Li
What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry
Prescriptive Analysis
Table of Contents
1 What is Data Science
Overview
Data Science Workflow
2 Churn Model
Business Understanding
Demo in R
Top Features
Prescriptive Analysis
3 Yesware Email Analysis
Email Analysis
4 Data Science in the Industry
Data Science Toolbox
Big Data
Data Science In Action Ji Li
What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry
Prescriptive Analysis
Churn analysis
Data Science In Action Ji Li
What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry
Prescriptive Analysis
Churn analysis
Data Science In Action Ji Li
What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry
Email Analysis
Table of Contents
1 What is Data Science
Overview
Data Science Workflow
2 Churn Model
Business Understanding
Demo in R
Top Features
Prescriptive Analysis
3 Yesware Email Analysis
Email Analysis
4 Data Science in the Industry
Data Science Toolbox
Big Data
Data Science In Action Ji Li
What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry
Email Analysis
When to send an email
To Optimize Email Reply
Yesware users want to know how to get more replies.
Data Science In Action Ji Li
What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry
Email Analysis
When to send an email
To Optimize Email Reply
Yesware users want to know how to get more replies.
An Email Reply Model
1 Construct features from the email data.
2 Create a model to predict the reply on each email.
3 Identify some top features that contributed most to the reply:
Sent Hour
Sent Weekday
Data Science In Action Ji Li
What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry
Email Analysis
When to send an email
To Optimize Email Reply
Yesware users want to know how to get more replies.
Emails Sent Volume and Reply Rate by Sent Hour
Data Science In Action Ji Li
What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry
Email Analysis
When to send an email
To Optimize Email Reply
Yesware users want to know how to get more replies.
Emails Sent Volume and Reply Rate by Sent Hour
Emails Sent Volume and Reply Rate by Sent Weekday
Data Science In Action Ji Li
What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry
Email Analysis
D3 Visualization: Subject Line Keywords
Subject line keywords: http://goo.gl/PK9xhO
Data Science In Action Ji Li
What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry
Data Science Toolbox
Table of Contents
1 What is Data Science
Overview
Data Science Workflow
2 Churn Model
Business Understanding
Demo in R
Top Features
Prescriptive Analysis
3 Yesware Email Analysis
Email Analysis
4 Data Science in the Industry
Data Science Toolbox
Big Data
Data Science In Action Ji Li
What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry
Data Science Toolbox
Why do we need a toolbox?
Data Science In Action Ji Li
What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry
Data Science Toolbox
Tools that helped me do data science fast
Data Science In Action Ji Li
What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry
Data Science Toolbox
Data visualization tools
D3: http://d3js.org/
Data Science In Action Ji Li
What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry
Big Data
Table of Contents
1 What is Data Science
Overview
Data Science Workflow
2 Churn Model
Business Understanding
Demo in R
Top Features
Prescriptive Analysis
3 Yesware Email Analysis
Email Analysis
4 Data Science in the Industry
Data Science Toolbox
Big Data
Data Science In Action Ji Li
What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry
Big Data
Big Data
Big Data Ecosystem
Hadoop - file system
Data Science In Action Ji Li
What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry
Big Data
Big Data
Big Data Ecosystem
Hadoop - file system
Spark - computing system
Data Science In Action Ji Li
What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry
Big Data
Big Data
Big Data Ecosystem
Hadoop - file system
Spark - computing system
Spark Stack
https://spark.apache.org/
Data Science In Action Ji Li
What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry
Big Data
Big Data
Big Data Ecosystem
Hadoop - file system
Spark - computing system
Spark Stack
Spark SQL - Data Munging
Spark Streaming - Real
Time Processing
MLlib - Machine Learning
GraphX - Visualization
https://spark.apache.org/
Data Science In Action Ji Li
What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry
Big Data
Thank You
Please send your questions and feedback to me!
Data Science In Action Ji Li

More Related Content

What's hot

Predictive analytics: hot and getting hotter
Predictive analytics: hot and getting hotterPredictive analytics: hot and getting hotter
Predictive analytics: hot and getting hotter
The Marketing Distillery
 
Répondre à la question automatique avec le web
Répondre à la question automatique avec le webRépondre à la question automatique avec le web
Répondre à la question automatique avec le web
Ahmed Hammami
 
Best practices machine learning final
Best practices machine learning finalBest practices machine learning final
Best practices machine learning final
Dianna Doan
 

What's hot (20)

Popular Text Analytics Algorithms
Popular Text Analytics AlgorithmsPopular Text Analytics Algorithms
Popular Text Analytics Algorithms
 
27 ijcse-01238-5 sivaranjani
27 ijcse-01238-5 sivaranjani27 ijcse-01238-5 sivaranjani
27 ijcse-01238-5 sivaranjani
 
Interleaving, Evaluation to Self-learning Search @904Labs
Interleaving, Evaluation to Self-learning Search @904LabsInterleaving, Evaluation to Self-learning Search @904Labs
Interleaving, Evaluation to Self-learning Search @904Labs
 
professional fuzzy type-ahead rummage around in xml type-ahead search techni...
professional fuzzy type-ahead rummage around in xml  type-ahead search techni...professional fuzzy type-ahead rummage around in xml  type-ahead search techni...
professional fuzzy type-ahead rummage around in xml type-ahead search techni...
 
Survey Report: Results of a Survey on Microsoft Office 365
Survey Report: Results of a Survey on Microsoft Office 365Survey Report: Results of a Survey on Microsoft Office 365
Survey Report: Results of a Survey on Microsoft Office 365
 
Predictive analytics: hot and getting hotter
Predictive analytics: hot and getting hotterPredictive analytics: hot and getting hotter
Predictive analytics: hot and getting hotter
 
Introduction to Data Science and Analytics
Introduction to Data Science and AnalyticsIntroduction to Data Science and Analytics
Introduction to Data Science and Analytics
 
2016 Data Science Salary Survey
2016 Data Science Salary Survey2016 Data Science Salary Survey
2016 Data Science Salary Survey
 
Tips and Tricks to be an Effective Data Scientist
Tips and Tricks to be an Effective Data ScientistTips and Tricks to be an Effective Data Scientist
Tips and Tricks to be an Effective Data Scientist
 
O’reilly media 2014 data-science-salary-survey
O’reilly media 2014 data-science-salary-surveyO’reilly media 2014 data-science-salary-survey
O’reilly media 2014 data-science-salary-survey
 
Haystack keynote 2019: What is Search Relevance? - Max Irwin
Haystack keynote 2019: What is Search Relevance? - Max IrwinHaystack keynote 2019: What is Search Relevance? - Max Irwin
Haystack keynote 2019: What is Search Relevance? - Max Irwin
 
Répondre à la question automatique avec le web
Répondre à la question automatique avec le webRépondre à la question automatique avec le web
Répondre à la question automatique avec le web
 
IRJET- Review on Information Retrieval for Desktop Search Engine
IRJET-  	  Review on Information Retrieval for Desktop Search EngineIRJET-  	  Review on Information Retrieval for Desktop Search Engine
IRJET- Review on Information Retrieval for Desktop Search Engine
 
Strategies for Practical Active Learning, Robert Munro
Strategies for Practical Active Learning, Robert MunroStrategies for Practical Active Learning, Robert Munro
Strategies for Practical Active Learning, Robert Munro
 
Best practices machine learning final
Best practices machine learning finalBest practices machine learning final
Best practices machine learning final
 
Graduation Thesis Sample
Graduation Thesis SampleGraduation Thesis Sample
Graduation Thesis Sample
 
META DATA QUALITY CONTROL ARCHITECTURE IN DATA WAREHOUSING
META DATA QUALITY CONTROL ARCHITECTURE IN DATA WAREHOUSINGMETA DATA QUALITY CONTROL ARCHITECTURE IN DATA WAREHOUSING
META DATA QUALITY CONTROL ARCHITECTURE IN DATA WAREHOUSING
 
Empirical discovery concept model
Empirical discovery concept modelEmpirical discovery concept model
Empirical discovery concept model
 
What does it_takes_to_be_a_good_data_scientist_2019_aim_simplilearn
What does it_takes_to_be_a_good_data_scientist_2019_aim_simplilearnWhat does it_takes_to_be_a_good_data_scientist_2019_aim_simplilearn
What does it_takes_to_be_a_good_data_scientist_2019_aim_simplilearn
 
Construction and Querying of Dynamic Knowledge Graphs
Construction and Querying of Dynamic Knowledge GraphsConstruction and Querying of Dynamic Knowledge Graphs
Construction and Querying of Dynamic Knowledge Graphs
 

Viewers also liked

Viewers also liked (20)

Customer attrition and churn modeling
Customer attrition and churn modelingCustomer attrition and churn modeling
Customer attrition and churn modeling
 
Churn model for telecom
Churn model for telecomChurn model for telecom
Churn model for telecom
 
Machine Learning and Deep Learning with R
Machine Learning and Deep Learning with RMachine Learning and Deep Learning with R
Machine Learning and Deep Learning with R
 
Churn Analysis in Telecom Industry
Churn Analysis in Telecom IndustryChurn Analysis in Telecom Industry
Churn Analysis in Telecom Industry
 
Customer Churn Analytics using Microsoft R Open
Customer Churn Analytics using Microsoft R OpenCustomer Churn Analytics using Microsoft R Open
Customer Churn Analytics using Microsoft R Open
 
AWS re:Invent 2016: Predicting Customer Churn with Amazon Machine Learning (M...
AWS re:Invent 2016: Predicting Customer Churn with Amazon Machine Learning (M...AWS re:Invent 2016: Predicting Customer Churn with Amazon Machine Learning (M...
AWS re:Invent 2016: Predicting Customer Churn with Amazon Machine Learning (M...
 
Beyond Churn Prediction : An Introduction to uplift modeling
Beyond Churn Prediction : An Introduction to uplift modelingBeyond Churn Prediction : An Introduction to uplift modeling
Beyond Churn Prediction : An Introduction to uplift modeling
 
Data analytics telecom churn final ppt
Data analytics telecom churn final ppt Data analytics telecom churn final ppt
Data analytics telecom churn final ppt
 
Churn Predictive Modelling
Churn Predictive ModellingChurn Predictive Modelling
Churn Predictive Modelling
 
Analytics, KPIs for effective Churn & Loyalty management
Analytics, KPIs for effective Churn & Loyalty managementAnalytics, KPIs for effective Churn & Loyalty management
Analytics, KPIs for effective Churn & Loyalty management
 
MDEC Data Matters Series: machine learning and Deep Learning, A Primer
MDEC Data Matters Series: machine learning and Deep Learning, A PrimerMDEC Data Matters Series: machine learning and Deep Learning, A Primer
MDEC Data Matters Series: machine learning and Deep Learning, A Primer
 
Churn modelling
Churn modellingChurn modelling
Churn modelling
 
Continuum Analytics and Python
Continuum Analytics and PythonContinuum Analytics and Python
Continuum Analytics and Python
 
Customer Churn, A Data Science Use Case in Telecom
Customer Churn, A Data Science Use Case in TelecomCustomer Churn, A Data Science Use Case in Telecom
Customer Churn, A Data Science Use Case in Telecom
 
Telecom Subscription, Churn and ARPU Analysis
Telecom Subscription, Churn and ARPU AnalysisTelecom Subscription, Churn and ARPU Analysis
Telecom Subscription, Churn and ARPU Analysis
 
Subscriber Churn Prediction Model using Social Network Analysis In Telecommun...
Subscriber Churn Prediction Model using Social Network Analysis In Telecommun...Subscriber Churn Prediction Model using Social Network Analysis In Telecommun...
Subscriber Churn Prediction Model using Social Network Analysis In Telecommun...
 
churn prediction in telecom
churn prediction in telecom churn prediction in telecom
churn prediction in telecom
 
Amazon Machine Learning Case Study: Predicting Customer Churn
Amazon Machine Learning Case Study: Predicting Customer ChurnAmazon Machine Learning Case Study: Predicting Customer Churn
Amazon Machine Learning Case Study: Predicting Customer Churn
 
Hadoop Interview Questions and Answers | Big Data Interview Questions | Hadoo...
Hadoop Interview Questions and Answers | Big Data Interview Questions | Hadoo...Hadoop Interview Questions and Answers | Big Data Interview Questions | Hadoo...
Hadoop Interview Questions and Answers | Big Data Interview Questions | Hadoo...
 
What Is Data Science? Data Science Course - Data Science Tutorial For Beginne...
What Is Data Science? Data Science Course - Data Science Tutorial For Beginne...What Is Data Science? Data Science Course - Data Science Tutorial For Beginne...
What Is Data Science? Data Science Course - Data Science Tutorial For Beginne...
 

Similar to Data science in_action

Data Scientist Salary, Skills, Jobs And Resume | Data Scientist Career | Data...
Data Scientist Salary, Skills, Jobs And Resume | Data Scientist Career | Data...Data Scientist Salary, Skills, Jobs And Resume | Data Scientist Career | Data...
Data Scientist Salary, Skills, Jobs And Resume | Data Scientist Career | Data...
Simplilearn
 
Data+Science+in+Python+-+Data+Prep+&+EDA.pdf
Data+Science+in+Python+-+Data+Prep+&+EDA.pdfData+Science+in+Python+-+Data+Prep+&+EDA.pdf
Data+Science+in+Python+-+Data+Prep+&+EDA.pdf
neelakandan2001kpm
 
Introduction to Business and Data Analysis Undergraduate.pdf
Introduction to Business and Data Analysis Undergraduate.pdfIntroduction to Business and Data Analysis Undergraduate.pdf
Introduction to Business and Data Analysis Undergraduate.pdf
AbdulrahimShaibuIssa
 

Similar to Data science in_action (20)

Data science vs. Data scientist by Jothi Periasamy
Data science vs. Data scientist by Jothi PeriasamyData science vs. Data scientist by Jothi Periasamy
Data science vs. Data scientist by Jothi Periasamy
 
D92-198gstindspdx
D92-198gstindspdxD92-198gstindspdx
D92-198gstindspdx
 
Data Scientist Salary, Skills, Jobs And Resume | Data Scientist Career | Data...
Data Scientist Salary, Skills, Jobs And Resume | Data Scientist Career | Data...Data Scientist Salary, Skills, Jobs And Resume | Data Scientist Career | Data...
Data Scientist Salary, Skills, Jobs And Resume | Data Scientist Career | Data...
 
How to Start Doing Data Science
How to Start Doing Data ScienceHow to Start Doing Data Science
How to Start Doing Data Science
 
Data+Science+in+Python+-+Data+Prep+&+EDA.pdf
Data+Science+in+Python+-+Data+Prep+&+EDA.pdfData+Science+in+Python+-+Data+Prep+&+EDA.pdf
Data+Science+in+Python+-+Data+Prep+&+EDA.pdf
 
Startds9.19.17sd
Startds9.19.17sdStartds9.19.17sd
Startds9.19.17sd
 
Getstarteddssd12717sd
Getstarteddssd12717sdGetstarteddssd12717sd
Getstarteddssd12717sd
 
Data Science Workshop - day 1
Data Science Workshop - day 1Data Science Workshop - day 1
Data Science Workshop - day 1
 
Data sci sd-11.6.17
Data sci sd-11.6.17Data sci sd-11.6.17
Data sci sd-11.6.17
 
Data Analytics Webinar for Aspirants
Data Analytics Webinar for AspirantsData Analytics Webinar for Aspirants
Data Analytics Webinar for Aspirants
 
Thinkful DC - Intro to Data Science
Thinkful DC - Intro to Data Science Thinkful DC - Intro to Data Science
Thinkful DC - Intro to Data Science
 
Experience unparalleled data-driven success with our cutting-edge Data Scienc...
Experience unparalleled data-driven success with our cutting-edge Data Scienc...Experience unparalleled data-driven success with our cutting-edge Data Scienc...
Experience unparalleled data-driven success with our cutting-edge Data Scienc...
 
Dallas datascienceconference jasongeng-v3
Dallas datascienceconference jasongeng-v3Dallas datascienceconference jasongeng-v3
Dallas datascienceconference jasongeng-v3
 
Data Science Project Lifecycle and Skill Set
Data Science Project Lifecycle and Skill SetData Science Project Lifecycle and Skill Set
Data Science Project Lifecycle and Skill Set
 
Deck 92-146 (3)
Deck 92-146 (3)Deck 92-146 (3)
Deck 92-146 (3)
 
Introduction to Business and Data Analysis Undergraduate.pdf
Introduction to Business and Data Analysis Undergraduate.pdfIntroduction to Business and Data Analysis Undergraduate.pdf
Introduction to Business and Data Analysis Undergraduate.pdf
 
Introduction-to-Data-Science.pdf
Introduction-to-Data-Science.pdfIntroduction-to-Data-Science.pdf
Introduction-to-Data-Science.pdf
 
Introduction-to-Data-Science.pdf
Introduction-to-Data-Science.pdfIntroduction-to-Data-Science.pdf
Introduction-to-Data-Science.pdf
 
Data Analytics Course In Surat.pdf
Data Analytics Course In Surat.pdfData Analytics Course In Surat.pdf
Data Analytics Course In Surat.pdf
 
Career_Jobs_in_Data_Science.pptx
Career_Jobs_in_Data_Science.pptxCareer_Jobs_in_Data_Science.pptx
Career_Jobs_in_Data_Science.pptx
 

More from Ji Li (8)

Introduction to Anova
Introduction to AnovaIntroduction to Anova
Introduction to Anova
 
Preferred Parking Perplexity
Preferred Parking PerplexityPreferred Parking Perplexity
Preferred Parking Perplexity
 
Proof with Crayon
Proof with CrayonProof with Crayon
Proof with Crayon
 
Time Value of Money
Time Value of MoneyTime Value of Money
Time Value of Money
 
Arithmetic Product of Species
Arithmetic Product of SpeciesArithmetic Product of Species
Arithmetic Product of Species
 
Not Only About Mathematics Tutoring
Not Only About Mathematics TutoringNot Only About Mathematics Tutoring
Not Only About Mathematics Tutoring
 
Bi-Point-Determining Graphs
Bi-Point-Determining GraphsBi-Point-Determining Graphs
Bi-Point-Determining Graphs
 
Thesis Defense of Ji Li
Thesis Defense of Ji LiThesis Defense of Ji Li
Thesis Defense of Ji Li
 

Recently uploaded

Additional Benefits for Employee Website.pdf
Additional Benefits for Employee Website.pdfAdditional Benefits for Employee Website.pdf
Additional Benefits for Employee Website.pdf
joachimlavalley1
 
Industrial Training Report- AKTU Industrial Training Report
Industrial Training Report- AKTU Industrial Training ReportIndustrial Training Report- AKTU Industrial Training Report
Industrial Training Report- AKTU Industrial Training Report
Avinash Rai
 

Recently uploaded (20)

Palestine last event orientationfvgnh .pptx
Palestine last event orientationfvgnh .pptxPalestine last event orientationfvgnh .pptx
Palestine last event orientationfvgnh .pptx
 
Danh sách HSG Bộ môn cấp trường - Cấp THPT.pdf
Danh sách HSG Bộ môn cấp trường - Cấp THPT.pdfDanh sách HSG Bộ môn cấp trường - Cấp THPT.pdf
Danh sách HSG Bộ môn cấp trường - Cấp THPT.pdf
 
NLC-2024-Orientation-for-RO-SDO (1).pptx
NLC-2024-Orientation-for-RO-SDO (1).pptxNLC-2024-Orientation-for-RO-SDO (1).pptx
NLC-2024-Orientation-for-RO-SDO (1).pptx
 
Fish and Chips - have they had their chips
Fish and Chips - have they had their chipsFish and Chips - have they had their chips
Fish and Chips - have they had their chips
 
UNIT – IV_PCI Complaints: Complaints and evaluation of complaints, Handling o...
UNIT – IV_PCI Complaints: Complaints and evaluation of complaints, Handling o...UNIT – IV_PCI Complaints: Complaints and evaluation of complaints, Handling o...
UNIT – IV_PCI Complaints: Complaints and evaluation of complaints, Handling o...
 
Additional Benefits for Employee Website.pdf
Additional Benefits for Employee Website.pdfAdditional Benefits for Employee Website.pdf
Additional Benefits for Employee Website.pdf
 
Industrial Training Report- AKTU Industrial Training Report
Industrial Training Report- AKTU Industrial Training ReportIndustrial Training Report- AKTU Industrial Training Report
Industrial Training Report- AKTU Industrial Training Report
 
Basic Civil Engg Notes_Chapter-6_Environment Pollution & Engineering
Basic Civil Engg Notes_Chapter-6_Environment Pollution & EngineeringBasic Civil Engg Notes_Chapter-6_Environment Pollution & Engineering
Basic Civil Engg Notes_Chapter-6_Environment Pollution & Engineering
 
Students, digital devices and success - Andreas Schleicher - 27 May 2024..pptx
Students, digital devices and success - Andreas Schleicher - 27 May 2024..pptxStudents, digital devices and success - Andreas Schleicher - 27 May 2024..pptx
Students, digital devices and success - Andreas Schleicher - 27 May 2024..pptx
 
The Art Pastor's Guide to Sabbath | Steve Thomason
The Art Pastor's Guide to Sabbath | Steve ThomasonThe Art Pastor's Guide to Sabbath | Steve Thomason
The Art Pastor's Guide to Sabbath | Steve Thomason
 
Ethnobotany and Ethnopharmacology ......
Ethnobotany and Ethnopharmacology ......Ethnobotany and Ethnopharmacology ......
Ethnobotany and Ethnopharmacology ......
 
Home assignment II on Spectroscopy 2024 Answers.pdf
Home assignment II on Spectroscopy 2024 Answers.pdfHome assignment II on Spectroscopy 2024 Answers.pdf
Home assignment II on Spectroscopy 2024 Answers.pdf
 
Solid waste management & Types of Basic civil Engineering notes by DJ Sir.pptx
Solid waste management & Types of Basic civil Engineering notes by DJ Sir.pptxSolid waste management & Types of Basic civil Engineering notes by DJ Sir.pptx
Solid waste management & Types of Basic civil Engineering notes by DJ Sir.pptx
 
Sectors of the Indian Economy - Class 10 Study Notes pdf
Sectors of the Indian Economy - Class 10 Study Notes pdfSectors of the Indian Economy - Class 10 Study Notes pdf
Sectors of the Indian Economy - Class 10 Study Notes pdf
 
Instructions for Submissions thorugh G- Classroom.pptx
Instructions for Submissions thorugh G- Classroom.pptxInstructions for Submissions thorugh G- Classroom.pptx
Instructions for Submissions thorugh G- Classroom.pptx
 
Operations Management - Book1.p - Dr. Abdulfatah A. Salem
Operations Management - Book1.p  - Dr. Abdulfatah A. SalemOperations Management - Book1.p  - Dr. Abdulfatah A. Salem
Operations Management - Book1.p - Dr. Abdulfatah A. Salem
 
How to Break the cycle of negative Thoughts
How to Break the cycle of negative ThoughtsHow to Break the cycle of negative Thoughts
How to Break the cycle of negative Thoughts
 
Basic Civil Engineering Notes of Chapter-6, Topic- Ecosystem, Biodiversity G...
Basic Civil Engineering Notes of Chapter-6,  Topic- Ecosystem, Biodiversity G...Basic Civil Engineering Notes of Chapter-6,  Topic- Ecosystem, Biodiversity G...
Basic Civil Engineering Notes of Chapter-6, Topic- Ecosystem, Biodiversity G...
 
NCERT Solutions Power Sharing Class 10 Notes pdf
NCERT Solutions Power Sharing Class 10 Notes pdfNCERT Solutions Power Sharing Class 10 Notes pdf
NCERT Solutions Power Sharing Class 10 Notes pdf
 
PART A. Introduction to Costumer Service
PART A. Introduction to Costumer ServicePART A. Introduction to Costumer Service
PART A. Introduction to Costumer Service
 

Data science in_action

  • 1. What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry Data Science in Action Ji Li Data Scientist March 25, 2015 Data Science In Action Ji Li
  • 2. What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry Overview Table of Contents 1 What is Data Science Overview Data Science Workflow 2 Churn Model Business Understanding Demo in R Top Features Prescriptive Analysis 3 Yesware Email Analysis Email Analysis 4 Data Science in the Industry Data Science Toolbox Big Data Data Science In Action Ji Li
  • 3. What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry Overview Data science on Google search Data Science In Action Ji Li
  • 4. What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry Overview Data science from my point of view Data Science In Action Ji Li
  • 5. What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry Data Science Workflow Table of Contents 1 What is Data Science Overview Data Science Workflow 2 Churn Model Business Understanding Demo in R Top Features Prescriptive Analysis 3 Yesware Email Analysis Email Analysis 4 Data Science in the Industry Data Science Toolbox Big Data Data Science In Action Ji Li
  • 6. What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry Data Science Workflow Domain expertise guides the workflow Data Science Workflow Data Munging Data Mining Delivery of actionable Insights Data Science In Action Ji Li
  • 7. What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry Data Science Workflow Data Munging Data Munging Data Munging means some or all of the following tasks: ETL Data Integration Data Cleansing Data Science In Action Ji Li
  • 8. What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry Data Science Workflow Data Munging Data Munging Data Munging means some or all of the following tasks: ETL Data Integration Data Cleansing ETL The process of extract, transform, and load data. To acquire data from external sources. To migrate multiple data sources internally. Data Science In Action Ji Li
  • 9. What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry Data Science Workflow Data Munging Data Munging Data Munging means some or all of the following tasks: ETL Data Integration Data Cleansing Data Integration To combine data from disparate sources into meaningful and valuable information. Data Science In Action Ji Li
  • 10. What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry Data Science Workflow Data Munging Data Munging Data Munging means some or all of the following tasks: ETL Data Integration Data Cleansing Data Cleansing Data cleansing, also called data scrubbing, is the process of amending or removing data in a database that is incorrect, incomplete, improperly formatted, or duplicated. Data Science In Action Ji Li
  • 11. What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry Data Science Workflow Data Mining Data Mining Data Mining is the key step to turn data into insights: Data Exploration Machine Learning Model Evaluation Data Science In Action Ji Li
  • 12. What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry Data Science Workflow Data Mining Data Mining Data Mining is the key step to turn data into insights: Data Exploration Machine Learning Model Evaluation Data Exploration The process of visually examine and explore the data. To gain basic understanding of the data. To identify relationships between different attributes. To answer basic questions using data. Data Science In Action Ji Li
  • 13. What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry Data Science Workflow Data Mining Data Mining Data Mining is the key step to turn data into insights: Data Exploration Machine Learning Model Evaluation Machine Learning To obtain statistical models, we usually need to go through multiple steps like the following: 1 Construct new features. 2 Remove redundant features. 3 Choose one or more suitable machine learning algorithm. Data Science In Action Ji Li
  • 14. What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry Data Science Workflow Data Mining Data Mining Data Mining is the key step to turn data into insights: Data Exploration Machine Learning Model Evaluation Model Evaluation Model evaluation is often used not only to select the best model from the set of models, but also to get ready for producing actionable insights. Data Science In Action Ji Li
  • 15. What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry Data Science Workflow Skills of a data scientist Data Science skills Programming Statistics Visualization Data Science In Action Ji Li
  • 16. What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry Data Science Workflow Doing data science Data Science In Action Ji Li
  • 17. What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry Business Understanding Table of Contents 1 What is Data Science Overview Data Science Workflow 2 Churn Model Business Understanding Demo in R Top Features Prescriptive Analysis 3 Yesware Email Analysis Email Analysis 4 Data Science in the Industry Data Science Toolbox Big Data Data Science In Action Ji Li
  • 18. What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry Business Understanding The Goal Goal To predict which customers will churn. To find ways to prevent customers from churning. Workflow Data Science In Action Ji Li
  • 19. What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry Business Understanding Customer renewal data Data Science In Action Ji Li
  • 20. What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry Business Understanding Customer service call data Data Science In Action Ji Li
  • 21. What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry Business Understanding Customer plan type and demographic Data Science In Action Ji Li
  • 22. What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry Business Understanding Customer usage data Data Science In Action Ji Li
  • 23. What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry Business Understanding Churn data structure Data Science In Action Ji Li
  • 24. What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry Demo in R Table of Contents 1 What is Data Science Overview Data Science Workflow 2 Churn Model Business Understanding Demo in R Top Features Prescriptive Analysis 3 Yesware Email Analysis Email Analysis 4 Data Science in the Industry Data Science Toolbox Big Data Data Science In Action Ji Li
  • 25. What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry Demo in R R Script http://goo.gl/IV3HDs Data Science In Action Ji Li
  • 26. What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry Top Features Table of Contents 1 What is Data Science Overview Data Science Workflow 2 Churn Model Business Understanding Demo in R Top Features Prescriptive Analysis 3 Yesware Email Analysis Email Analysis 4 Data Science in the Industry Data Science Toolbox Big Data Data Science In Action Ji Li
  • 27. What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry Top Features Variable importance from random forest Top Features cust surv calls is intl plan day mins Data Science In Action Ji Li
  • 28. What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry Top Features Top feature - customer service calls Data Science In Action Ji Li
  • 29. What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry Top Features Top feature - customer service calls Data Science In Action Ji Li
  • 30. What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry Top Features Top feature - customer day time minutes Data Science In Action Ji Li
  • 31. What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry Top Features Top feature - international plan and international calls Data Science In Action Ji Li
  • 32. What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry Prescriptive Analysis Table of Contents 1 What is Data Science Overview Data Science Workflow 2 Churn Model Business Understanding Demo in R Top Features Prescriptive Analysis 3 Yesware Email Analysis Email Analysis 4 Data Science in the Industry Data Science Toolbox Big Data Data Science In Action Ji Li
  • 33. What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry Prescriptive Analysis Churn analysis Data Science In Action Ji Li
  • 34. What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry Prescriptive Analysis Churn analysis Data Science In Action Ji Li
  • 35. What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry Email Analysis Table of Contents 1 What is Data Science Overview Data Science Workflow 2 Churn Model Business Understanding Demo in R Top Features Prescriptive Analysis 3 Yesware Email Analysis Email Analysis 4 Data Science in the Industry Data Science Toolbox Big Data Data Science In Action Ji Li
  • 36. What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry Email Analysis When to send an email To Optimize Email Reply Yesware users want to know how to get more replies. Data Science In Action Ji Li
  • 37. What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry Email Analysis When to send an email To Optimize Email Reply Yesware users want to know how to get more replies. An Email Reply Model 1 Construct features from the email data. 2 Create a model to predict the reply on each email. 3 Identify some top features that contributed most to the reply: Sent Hour Sent Weekday Data Science In Action Ji Li
  • 38. What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry Email Analysis When to send an email To Optimize Email Reply Yesware users want to know how to get more replies. Emails Sent Volume and Reply Rate by Sent Hour Data Science In Action Ji Li
  • 39. What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry Email Analysis When to send an email To Optimize Email Reply Yesware users want to know how to get more replies. Emails Sent Volume and Reply Rate by Sent Hour Emails Sent Volume and Reply Rate by Sent Weekday Data Science In Action Ji Li
  • 40. What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry Email Analysis D3 Visualization: Subject Line Keywords Subject line keywords: http://goo.gl/PK9xhO Data Science In Action Ji Li
  • 41. What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry Data Science Toolbox Table of Contents 1 What is Data Science Overview Data Science Workflow 2 Churn Model Business Understanding Demo in R Top Features Prescriptive Analysis 3 Yesware Email Analysis Email Analysis 4 Data Science in the Industry Data Science Toolbox Big Data Data Science In Action Ji Li
  • 42. What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry Data Science Toolbox Why do we need a toolbox? Data Science In Action Ji Li
  • 43. What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry Data Science Toolbox Tools that helped me do data science fast Data Science In Action Ji Li
  • 44. What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry Data Science Toolbox Data visualization tools D3: http://d3js.org/ Data Science In Action Ji Li
  • 45. What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry Big Data Table of Contents 1 What is Data Science Overview Data Science Workflow 2 Churn Model Business Understanding Demo in R Top Features Prescriptive Analysis 3 Yesware Email Analysis Email Analysis 4 Data Science in the Industry Data Science Toolbox Big Data Data Science In Action Ji Li
  • 46. What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry Big Data Big Data Big Data Ecosystem Hadoop - file system Data Science In Action Ji Li
  • 47. What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry Big Data Big Data Big Data Ecosystem Hadoop - file system Spark - computing system Data Science In Action Ji Li
  • 48. What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry Big Data Big Data Big Data Ecosystem Hadoop - file system Spark - computing system Spark Stack https://spark.apache.org/ Data Science In Action Ji Li
  • 49. What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry Big Data Big Data Big Data Ecosystem Hadoop - file system Spark - computing system Spark Stack Spark SQL - Data Munging Spark Streaming - Real Time Processing MLlib - Machine Learning GraphX - Visualization https://spark.apache.org/ Data Science In Action Ji Li
  • 50. What is Data Science Churn Model Yesware Email Analysis Data Science in the Industry Big Data Thank You Please send your questions and feedback to me! Data Science In Action Ji Li