SlideShare a Scribd company logo
Story of Building a Telecom
Data Solution
Sawinder Pal Kaur, PhD
Data Scientist, SAP Labs
Outline
1. Define business objectives and translating business
problem into data science problem
2. Introduction to Telecom data - data scale, volume,
continuous and categorical variables, static and dynamic
data
3. Architecture and data processing pipeline: Big data
handling and data science methods for Categorical
feature selection
4. Solution Engineering: How to keep project managers do
feature selection and identify the opportunities to
optimize the existing plans and services?
Business Objective
Business Objective
• Personalize
recommendation
• More customer satisfaction
• Improved Customer
retention
• Increased frequency of
selling
• Better mix of products
• Increased customer loyalty
• Better decision on coupons
and discounts
• Develop effective strategy for
new product launches
• Better offers to specific
customer profile
• Better product design /
pricing
• Improve quality of service
for highest margin
customers
• Invest where highest
margin customers are
using the network
resources
Recommend
Plans and Services
Grouping/
Clustering
Identify Profit
Maximization
Opportunities
Telecom Data &
Data Processing
Pipeline
Data
• How much data is available?
• Data infrastructure
• Data dashboards
• Data preparation for
Machine learning
• Data protection and privacy
Partitioning the data into similar groups
Multi dimensional clustering
Grouping customers-
One dimensional
binning/clustering
High, low, and normal
profitable customers -
One dimensional outlier
detection
Multi dimensional outlier detection
• Dealing with missing –
• Delete the rows with missing
• Replace missing using
• mean/median
• Other number
• Conditional mean
• Model like K nearest neighborhood
• Filter Methods – used as independent feature selection e.g.
Pearson correlation, Mutual Information, MRMR
• Dimensionality reduction – PCA, Variational autoencoder
• Feature Engineering
• Creating new variables – Polynomials, Interaction variables, Ratios
• Wrapper and Embedded methods - used in the model building
process
Feature
selection
Base set
Learning
Model
Performance
Business Insights
Cluster Size Revenue Profit Usage Discount Cost
1 1283 0.05 -0.24 0.90 0.23 0.46
2 582 -0.13 -0.05 -0.15 -1.87 -0.10
3 71 -0.28 -0.55 0.05 -8.07 0.46
4 5309 -0.17 -0.01 -0.37 0.25 -0.25
5 9 19.37 16.26 1.12 -0.06 3.03
6 222 0.10 -1.19 3.66 0.13 2.06
7 270 2.75 2.35 0.11 0.08 0.36
8 8 0.64 -12.55 6.61 0.25 20.97
Revenue, profit and
cost is
very high
Profit is very low
profit and cost and
volume are very high
Telecom Data Analytics

More Related Content

Similar to Telecom Data Analytics

Barga Galvanize Sept 2015
Barga Galvanize Sept 2015Barga Galvanize Sept 2015
Barga Galvanize Sept 2015
Roger Barga
 
Machine intelligence data science methodology 060420
Machine intelligence data science methodology 060420Machine intelligence data science methodology 060420
Machine intelligence data science methodology 060420
Jeremy Lehman
 
Tata steel ideation contest
Tata steel ideation contestTata steel ideation contest
Tata steel ideation contestashwinikumar1424
 
RowanDay3.pptx
RowanDay3.pptxRowanDay3.pptx
RowanDay3.pptx
MattMarino13
 
AI Class Topic 3: Building Machine Learning Predictive Systems (Predictive Ma...
AI Class Topic 3: Building Machine Learning Predictive Systems (Predictive Ma...AI Class Topic 3: Building Machine Learning Predictive Systems (Predictive Ma...
AI Class Topic 3: Building Machine Learning Predictive Systems (Predictive Ma...
Value Amplify Consulting
 
PrADS Introduction & offerings 2017
PrADS Introduction & offerings 2017 PrADS Introduction & offerings 2017
PrADS Introduction & offerings 2017
Kiran Kumar Muthyala
 
Business intelligence prof nikhat fatma mumtaz husain shaikh
Business intelligence  prof nikhat fatma mumtaz husain shaikhBusiness intelligence  prof nikhat fatma mumtaz husain shaikh
Business intelligence prof nikhat fatma mumtaz husain shaikh
Nikhat Fatma Mumtaz Husain Shaikh
 
finalestkddfinalpresentation-111207021040-phpapp01.pptx
finalestkddfinalpresentation-111207021040-phpapp01.pptxfinalestkddfinalpresentation-111207021040-phpapp01.pptx
finalestkddfinalpresentation-111207021040-phpapp01.pptx
shumPanwar
 
Day 1 (Lecture 2): Business Analytics
Day 1 (Lecture 2): Business AnalyticsDay 1 (Lecture 2): Business Analytics
Day 1 (Lecture 2): Business Analytics
Aseda Owusua Addai-Deseh
 
Deliveinrg explainable AI
Deliveinrg explainable AIDeliveinrg explainable AI
Deliveinrg explainable AI
Gary Allemann
 
Is Your Marketing Database "Model Ready"?
Is Your Marketing Database "Model Ready"?Is Your Marketing Database "Model Ready"?
Is Your Marketing Database "Model Ready"?Vivastream
 
Deep learning
Deep learningDeep learning
Deep learning
Arun Shukla
 
Data Science Introduction by Emerging India Analytics
Data Science Introduction by Emerging India AnalyticsData Science Introduction by Emerging India Analytics
Data Science Introduction by Emerging India Analytics
AyeshaSharma29
 
Group 1 Report CRISP - DM METHODOLOGY.pptx
Group 1 Report CRISP - DM METHODOLOGY.pptxGroup 1 Report CRISP - DM METHODOLOGY.pptx
Group 1 Report CRISP - DM METHODOLOGY.pptx
ellamangapis2003
 
HashCash big data services
HashCash big data servicesHashCash big data services
HashCash big data services
HashCash Consultants
 
Technology to decision analysis
Technology to decision analysisTechnology to decision analysis
Technology to decision analysis
John Frias Morales, DrBA, MS
 
Data Refinement: The missing link between data collection and decisions
Data Refinement: The missing link between data collection and decisionsData Refinement: The missing link between data collection and decisions
Data Refinement: The missing link between data collection and decisionsVivastream
 

Similar to Telecom Data Analytics (20)

Data mining
Data miningData mining
Data mining
 
Barga Galvanize Sept 2015
Barga Galvanize Sept 2015Barga Galvanize Sept 2015
Barga Galvanize Sept 2015
 
Machine intelligence data science methodology 060420
Machine intelligence data science methodology 060420Machine intelligence data science methodology 060420
Machine intelligence data science methodology 060420
 
Tata steel ideation
Tata steel ideationTata steel ideation
Tata steel ideation
 
Tata steel ideation contest
Tata steel ideation contestTata steel ideation contest
Tata steel ideation contest
 
Tata steel ideation contest
Tata steel ideation contestTata steel ideation contest
Tata steel ideation contest
 
RowanDay3.pptx
RowanDay3.pptxRowanDay3.pptx
RowanDay3.pptx
 
AI Class Topic 3: Building Machine Learning Predictive Systems (Predictive Ma...
AI Class Topic 3: Building Machine Learning Predictive Systems (Predictive Ma...AI Class Topic 3: Building Machine Learning Predictive Systems (Predictive Ma...
AI Class Topic 3: Building Machine Learning Predictive Systems (Predictive Ma...
 
PrADS Introduction & offerings 2017
PrADS Introduction & offerings 2017 PrADS Introduction & offerings 2017
PrADS Introduction & offerings 2017
 
Business intelligence prof nikhat fatma mumtaz husain shaikh
Business intelligence  prof nikhat fatma mumtaz husain shaikhBusiness intelligence  prof nikhat fatma mumtaz husain shaikh
Business intelligence prof nikhat fatma mumtaz husain shaikh
 
finalestkddfinalpresentation-111207021040-phpapp01.pptx
finalestkddfinalpresentation-111207021040-phpapp01.pptxfinalestkddfinalpresentation-111207021040-phpapp01.pptx
finalestkddfinalpresentation-111207021040-phpapp01.pptx
 
Day 1 (Lecture 2): Business Analytics
Day 1 (Lecture 2): Business AnalyticsDay 1 (Lecture 2): Business Analytics
Day 1 (Lecture 2): Business Analytics
 
Deliveinrg explainable AI
Deliveinrg explainable AIDeliveinrg explainable AI
Deliveinrg explainable AI
 
Is Your Marketing Database "Model Ready"?
Is Your Marketing Database "Model Ready"?Is Your Marketing Database "Model Ready"?
Is Your Marketing Database "Model Ready"?
 
Deep learning
Deep learningDeep learning
Deep learning
 
Data Science Introduction by Emerging India Analytics
Data Science Introduction by Emerging India AnalyticsData Science Introduction by Emerging India Analytics
Data Science Introduction by Emerging India Analytics
 
Group 1 Report CRISP - DM METHODOLOGY.pptx
Group 1 Report CRISP - DM METHODOLOGY.pptxGroup 1 Report CRISP - DM METHODOLOGY.pptx
Group 1 Report CRISP - DM METHODOLOGY.pptx
 
HashCash big data services
HashCash big data servicesHashCash big data services
HashCash big data services
 
Technology to decision analysis
Technology to decision analysisTechnology to decision analysis
Technology to decision analysis
 
Data Refinement: The missing link between data collection and decisions
Data Refinement: The missing link between data collection and decisionsData Refinement: The missing link between data collection and decisions
Data Refinement: The missing link between data collection and decisions
 

Recently uploaded

一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
mbawufebxi
 
社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .
NABLAS株式会社
 
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
NABLAS株式会社
 
Opendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptxOpendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptx
Opendatabay
 
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
yhkoc
 
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
ewymefz
 
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
u86oixdj
 
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
slg6lamcq
 
Q1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year ReboundQ1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year Rebound
Oppotus
 
SOCRadar Germany 2024 Threat Landscape Report
SOCRadar Germany 2024 Threat Landscape ReportSOCRadar Germany 2024 Threat Landscape Report
SOCRadar Germany 2024 Threat Landscape Report
SOCRadar
 
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
John Andrews
 
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
AbhimanyuSinha9
 
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
nscud
 
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
nscud
 
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Subhajit Sahu
 
The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...
jerlynmaetalle
 
一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单
ewymefz
 
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
ahzuo
 
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project PresentationPredicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Boston Institute of Analytics
 
FP Growth Algorithm and its Applications
FP Growth Algorithm and its ApplicationsFP Growth Algorithm and its Applications
FP Growth Algorithm and its Applications
MaleehaSheikh2
 

Recently uploaded (20)

一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
 
社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .
 
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
 
Opendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptxOpendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptx
 
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
 
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
 
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
 
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
 
Q1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year ReboundQ1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year Rebound
 
SOCRadar Germany 2024 Threat Landscape Report
SOCRadar Germany 2024 Threat Landscape ReportSOCRadar Germany 2024 Threat Landscape Report
SOCRadar Germany 2024 Threat Landscape Report
 
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...
 
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
 
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
 
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
 
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
 
The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...
 
一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单
 
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
一比一原版(CBU毕业证)卡普顿大学毕业证如何办理
 
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project PresentationPredicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
 
FP Growth Algorithm and its Applications
FP Growth Algorithm and its ApplicationsFP Growth Algorithm and its Applications
FP Growth Algorithm and its Applications
 

Telecom Data Analytics

  • 1. Story of Building a Telecom Data Solution Sawinder Pal Kaur, PhD Data Scientist, SAP Labs
  • 2. Outline 1. Define business objectives and translating business problem into data science problem 2. Introduction to Telecom data - data scale, volume, continuous and categorical variables, static and dynamic data 3. Architecture and data processing pipeline: Big data handling and data science methods for Categorical feature selection 4. Solution Engineering: How to keep project managers do feature selection and identify the opportunities to optimize the existing plans and services?
  • 4. Business Objective • Personalize recommendation • More customer satisfaction • Improved Customer retention • Increased frequency of selling • Better mix of products • Increased customer loyalty • Better decision on coupons and discounts • Develop effective strategy for new product launches • Better offers to specific customer profile • Better product design / pricing • Improve quality of service for highest margin customers • Invest where highest margin customers are using the network resources Recommend Plans and Services Grouping/ Clustering Identify Profit Maximization Opportunities
  • 5. Telecom Data & Data Processing Pipeline
  • 6. Data • How much data is available? • Data infrastructure • Data dashboards • Data preparation for Machine learning • Data protection and privacy
  • 7. Partitioning the data into similar groups Multi dimensional clustering Grouping customers- One dimensional binning/clustering
  • 8. High, low, and normal profitable customers - One dimensional outlier detection Multi dimensional outlier detection
  • 9. • Dealing with missing – • Delete the rows with missing • Replace missing using • mean/median • Other number • Conditional mean • Model like K nearest neighborhood
  • 10. • Filter Methods – used as independent feature selection e.g. Pearson correlation, Mutual Information, MRMR • Dimensionality reduction – PCA, Variational autoencoder • Feature Engineering • Creating new variables – Polynomials, Interaction variables, Ratios • Wrapper and Embedded methods - used in the model building process Feature selection Base set Learning Model Performance
  • 12. Cluster Size Revenue Profit Usage Discount Cost 1 1283 0.05 -0.24 0.90 0.23 0.46 2 582 -0.13 -0.05 -0.15 -1.87 -0.10 3 71 -0.28 -0.55 0.05 -8.07 0.46 4 5309 -0.17 -0.01 -0.37 0.25 -0.25 5 9 19.37 16.26 1.12 -0.06 3.03 6 222 0.10 -1.19 3.66 0.13 2.06 7 270 2.75 2.35 0.11 0.08 0.36 8 8 0.64 -12.55 6.61 0.25 20.97 Revenue, profit and cost is very high Profit is very low profit and cost and volume are very high