Telecom Data Analytics

Fifth Elephant 2019 Talk Proposal

  1. Story of Building a Telecom Data Solution. Sawinder Pal Kaur, PhD, Data Scientist, SAP Labs
  2. Outline
     1. Define business objectives and translate the business problem into a data science problem
     2. Introduction to telecom data: data scale, volume, continuous and categorical variables, static and dynamic data
     3. Architecture and data processing pipeline: big data handling and data science methods for categorical feature selection
     4. Solution engineering: how to help project managers do feature selection and identify opportunities to optimize the existing plans and services?
  3. Business Objective
  4. Business Objective
     • Personalize recommendation
     • More customer satisfaction
     • Improved customer retention
     • Increased frequency of selling
     • Better mix of products
     • Increased customer loyalty
     • Better decision on coupons and discounts
     • Develop effective strategy for new product launches
     • Better offers to specific customer profile
     • Better product design / pricing
     • Improve quality of service for highest margin customers
     • Invest where highest margin customers are using the network resources
     Diagram labels: Recommend Plans and Services; Grouping / Clustering; Identify Profit Maximization Opportunities
  5. Telecom Data & Data Processing Pipeline
  6. Data
     • How much data is available?
     • Data infrastructure
     • Data dashboards
     • Data preparation for machine learning
     • Data protection and privacy
  7. Partitioning the data into similar groups
     • Grouping customers: one-dimensional binning/clustering
     • Multi-dimensional clustering
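The two grouping approaches on this slide can be sketched as follows. The slides do not name an algorithm, so quantile binning and Lloyd's k-means are stand-ins chosen for illustration. The customer values, the function names `quantile_bins` and `kmeans`, and the choice to seed centroids with the first `k` points are all assumptions.

```python
import math
from statistics import quantiles


def quantile_bins(values, n_bins=3):
    """One-dimensional binning: map each value to a quantile bin 0..n_bins-1."""
    cuts = quantiles(values, n=n_bins)  # n_bins - 1 cut points
    return [sum(v > c for c in cuts) for v in values]


def kmeans(points, k, n_iter=20):
    """Lloyd's k-means on tuples; the first k points seed the centroids."""
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids


# Hypothetical customers: monthly data usage (GB) and monthly revenue.
usage = [1.0, 1.2, 0.8, 9.5, 10.1, 9.9]
revenue = [10.0, 12.0, 9.0, 80.0, 85.0, 82.0]

usage_bins = quantile_bins(usage, n_bins=2)  # split at the median
labels, centroids = kmeans(list(zip(usage, revenue)), k=2)
print(usage_bins, labels)
```

In practice the features would usually be scaled before clustering, since an unscaled column with a large range (revenue here) dominates the Euclidean distance.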
  8. High, low, and normal profitable customers
     • One-dimensional outlier detection
     • Multi-dimensional outlier detection
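A minimal sketch of both kinds of outlier detection, under stated assumptions: the slides do not say which rule was used, so Tukey's IQR fences (one-dimensional) and the norm of per-column z-scores (multi-dimensional) are illustrative choices. The profit and revenue/cost figures are made up, and `label_by_iqr` and `zscore_distance` are hypothetical helper names.

```python
import math
from statistics import mean, pstdev, quantiles


def label_by_iqr(values, k=1.5):
    """Label each value 'low', 'normal' or 'high' using Tukey's IQR fences."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return ["low" if v < lower else "high" if v > upper else "normal"
            for v in values]


def zscore_distance(rows):
    """Multi-dimensional score: Euclidean norm of each row's column z-scores.

    Assumes every column has non-zero spread.
    """
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) for c in cols]
    return [
        math.sqrt(sum(((x - m) / s) ** 2 for x, m, s in zip(row, mus, sds)))
        for row in rows
    ]


# Hypothetical per-customer profit.
profit = [10, 12, 11, 9, 13, 10, 11, 95, -60]
profit_labels = label_by_iqr(profit)

# Hypothetical (revenue, cost) pairs; the last customer is far from the rest.
revenue_cost = [(10, 5), (12, 6), (11, 5), (9, 4), (13, 6), (40, 30)]
scores = zscore_distance(revenue_cost)
most_unusual = scores.index(max(scores))
print(profit_labels, most_unusual)
```

The one-dimensional rule gives the high/low/normal split directly, while the multi-dimensional score only ranks customers by how unusual they are; a cut-off on that score would have to be chosen separately.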
  9. Dealing with missing values
     • Delete the rows with missing values
     • Replace missing values using
       • mean/median
       • another number
       • conditional mean
       • a model such as k-nearest neighbors
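Three of the options above (deleting rows, filling with the median, and filling with a conditional mean) can be sketched like this. The records, the `plan` and `usage` field names, and the helper functions are hypothetical. The model-based option (k-nearest neighbors) is not shown.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical customer records; None marks a missing monthly usage value.
rows = [
    {"plan": "prepaid", "usage": 2.0},
    {"plan": "prepaid", "usage": None},
    {"plan": "prepaid", "usage": 4.0},
    {"plan": "postpaid", "usage": 10.0},
    {"plan": "postpaid", "usage": None},
    {"plan": "postpaid", "usage": 14.0},
]


def drop_missing(rows, key):
    """Option 1: delete the rows where `key` is missing."""
    return [r for r in rows if r[key] is not None]


def fill_constant(rows, key, value):
    """Option 2: replace missing values with one number (mean, median, other)."""
    return [{**r, key: value if r[key] is None else r[key]} for r in rows]


def fill_conditional_mean(rows, key, group_key):
    """Option 3: replace missing values with the mean of the row's own group.

    Assumes every group has at least one observed value.
    """
    observed = defaultdict(list)
    for r in rows:
        if r[key] is not None:
            observed[r[group_key]].append(r[key])
    means = {group: mean(vals) for group, vals in observed.items()}
    return [{**r, key: means[r[group_key]] if r[key] is None else r[key]}
            for r in rows]


complete = drop_missing(rows, "usage")
overall_median = median(r["usage"] for r in complete)
median_filled = fill_constant(rows, "usage", overall_median)
conditional = fill_conditional_mean(rows, "usage", "plan")
print(median_filled[1], conditional[1], conditional[4])
```

The conditional mean keeps the prepaid and postpaid fills apart, whereas the overall median pulls both missing values toward the middle of the combined data.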
  10. Feature selection
     • Filter methods: used as independent feature selection, e.g. Pearson correlation, mutual information, MRMR
     • Dimensionality reduction: PCA, variational autoencoder
     • Feature engineering
       • Creating new variables: polynomials, interaction variables, ratios
     • Wrapper and embedded methods: used in the model building process
     Diagram labels: Feature selection; Base set; Learning; Model; Performance
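As an illustration of a filter method combined with a simple engineered feature, the sketch below ranks features by absolute Pearson correlation with a target, independently of any model. The data, the feature names (`data_gb`, `calls`, `data_per_call`) and the helper names are assumptions. Mutual information, MRMR and the wrapper/embedded methods are not shown.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def rank_features(features, target):
    """Filter method: score each feature on its own, highest |r| first."""
    scores = {name: abs(pearson(col, target)) for name, col in features.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical per-customer features and target.
features = {
    "data_gb": [1, 2, 3, 4, 5, 6],
    "calls": [5, 3, 6, 2, 7, 4],
}
revenue = [12, 19, 31, 42, 48, 61]

# Feature engineering: a ratio variable built from two existing columns.
features["data_per_call"] = [
    d / c for d, c in zip(features["data_gb"], features["calls"])
]

ranking = rank_features(features, revenue)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Pearson correlation only captures linear association, which is one reason a filter step like this is often paired with the other methods listed on the slide.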
  11. Business Insights
  12. Cluster table:

      Cluster   Size   Revenue   Profit   Usage   Discount    Cost
      1         1283      0.05    -0.24    0.90       0.23    0.46
      2          582     -0.13    -0.05   -0.15      -1.87   -0.10
      3           71     -0.28    -0.55    0.05      -8.07    0.46
      4         5309     -0.17    -0.01   -0.37       0.25   -0.25
      5            9     19.37    16.26    1.12      -0.06    3.03
      6          222      0.10    -1.19    3.66       0.13    2.06
      7          270      2.75     2.35    0.11       0.08    0.36
      8            8      0.64   -12.55    6.61       0.25   20.97

      Annotations on the slide: revenue, profit and cost are very high; profit is very low; profit and cost and volume are very high.
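One way to read the cluster table programmatically is to flag every metric whose value is far from zero. This is a sketch under two assumptions that the slides do not confirm: that the values are standardized deviations from the overall average, and that a cut-off of 2.0 is a sensible threshold. The numbers are copied from the table; `flag_extremes` and `THRESHOLD` are hypothetical names.

```python
# Metric columns and rows copied from the cluster table on the slide.
columns = ["Revenue", "Profit", "Usage", "Discount", "Cost"]
clusters = {
    1: (1283, [0.05, -0.24, 0.90, 0.23, 0.46]),
    2: (582, [-0.13, -0.05, -0.15, -1.87, -0.10]),
    3: (71, [-0.28, -0.55, 0.05, -8.07, 0.46]),
    4: (5309, [-0.17, -0.01, -0.37, 0.25, -0.25]),
    5: (9, [19.37, 16.26, 1.12, -0.06, 3.03]),
    6: (222, [0.10, -1.19, 3.66, 0.13, 2.06]),
    7: (270, [2.75, 2.35, 0.11, 0.08, 0.36]),
    8: (8, [0.64, -12.55, 6.61, 0.25, 20.97]),
}

THRESHOLD = 2.0  # hypothetical cut-off, not taken from the slides


def flag_extremes(clusters, columns, threshold):
    """Return {cluster_id: [(metric, 'high' | 'low'), ...]} for extreme values."""
    flags = {}
    for cluster_id, (_size, values) in clusters.items():
        hits = [
            (name, "high" if value > 0 else "low")
            for name, value in zip(columns, values)
            if abs(value) > threshold
        ]
        if hits:
            flags[cluster_id] = hits
    return flags


flags = flag_extremes(clusters, columns, THRESHOLD)
for cluster_id, hits in flags.items():
    size = clusters[cluster_id][0]
    print(f"cluster {cluster_id} (size {size}): {hits}")
```

With this threshold, clusters 3, 5, 6, 7 and 8 are flagged. The small clusters 5 and 8 show the most extreme values, which matches the kind of observation the slide annotations make.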
  • deepakts

    Jul. 30, 2019
