Better Customer Experience with Data Science - Bernard Burg, Comcast

Comcast is the third-largest Internet provider worldwide, managing massive networks which deliver connectivity and streaming content to millions of customers. Such networks face complex maintenance and troubleshooting issues. We use Machine Learning to analyze and model error patterns to continuously assess the health of our network and ensure a smooth experience for every user. This is supported with a Decision Engine, which can be configured to take appropriate remedial actions such as customer notifications and self-healing directives.

We describe an architecture that scales to billions of events per day and explain how H2O helps implement the underlying learning models. We illustrate the superiority of H2O algorithms in accuracy, speed, and memory footprint, with comparisons to other systems such as Spark ML. #h2ony

Published in: Data & Analytics

  1. 1. Better Customer Experience with Data Science (just add water). Bernard Burg, Comcast, bernard_burg@comcast.com. 7/19/16, H2O Open Tour 2016, New York
  2. 2. XFINITY TV, XFINITY Internet, XFINITY Voice, XFINITY Home, Digital & Other, Other. *Minority interest and/or non-controlling interest. Slide is not comprehensive of all Comcast NBCUniversal assets. Updated: December 22, 2015
  3. 3. Complex Troubleshooting
     • Failure scenario
       – Customer orders a Video-on-Demand
       – Transaction fails, customer care call initiated
     • Consequences
       – Unhappy customer: no visibility or opportunity to mitigate issue
       – Potentially avoidable phone call
     • Numerous potential reasons for failure
       – Billing
       – Resource unavailable
       – Service issue
       – Hardware issue (set-top box or router)
       – Software issue
       – Parental control settings
  4. 4. Analysis
     • What brought the customer to this point?
       – Call records
       – Billing history
       – Events generated by hardware
       – Upstream outages
       – Usage spikes
     • What’s the best course of action now?
     • How can we predict such issues?
  5. 5. Project Goals
     Improve Customer Experience
     • Keep our customers informed
     • Empower our CARE agents
       – Timely, accurate, complete information & context
       – Smart recommendations
     • Higher first call resolution
     Maximize Efficiency
     • Customer self service
       – Fewer calls & truck rolls
     • Self- and assisted-healing equipment
  6. 6. Goal of Data Science
     Each user’s set-top boxes can send more than 150 different error codes at any time.
     • Goal 1: predict if a user will call
     • Goal 2: predict why they call
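One way to feed error codes like those on slide 6 to a classifier is to count them per user. The sketch below is only an illustration under stated assumptions: the event tuples, the code names (`E03`, `E17`, `E42`), and the helper `error_count_features` are hypothetical, and the slides do not say how the actual temporal model encodes errors.

```python
from collections import Counter


def error_count_features(events, code_vocabulary):
    """Turn (user_id, error_code) events into one count vector per user.

    Assumption: a plain count per code; the talk's temporal model is not
    described in enough detail to reproduce it here.
    """
    known_codes = set(code_vocabulary)
    counts = {}
    for user_id, code in events:
        if code not in known_codes:
            continue  # ignore codes outside the known vocabulary
        counts.setdefault(user_id, Counter())[code] += 1
    # One fixed-length vector per user, ordered like code_vocabulary.
    return {
        user_id: [counter.get(code, 0) for code in code_vocabulary]
        for user_id, counter in counts.items()
    }


# Hypothetical events and error codes, for illustration only.
events = [("u1", "E17"), ("u1", "E17"), ("u1", "E42"), ("u2", "E03")]
vocabulary = ["E03", "E17", "E42"]
features = error_count_features(events, vocabulary)
```

Each vector has one position per known error code, so a real deployment with 150+ codes would yield vectors of that length.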
  7. 7. Predicting User Calls
     Using Error Model Alone
     • Gradient Boosting Machine, temporal model: 66% accuracy (calls vs. no-calls)
     • The algorithm reached a glass ceiling
     Using Error + User Behavior Models
     • Gradient Boosting Machine, temporal model + behavior model: 79% accuracy (calls vs. no-calls)
  8. 8. Predicting Why Users Call: A Single Algorithm Predicting 10 Buckets
     Gradient Boosting Machine, temporal model. 47% accuracy is not great but is about 5 times better than random.
     Metric          | Spark ML             | H2O
     Accuracy        | 42%                  | 47%
     Processing time | 10 minutes           | 2 minutes
     Memory          | Limited size of test | No limit reached
     Ease of use     | Program dataFrame    | UI
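The "about 5 times better than random" remark on slide 8 can be checked with a little arithmetic. This is a minimal sketch assuming the random baseline is a uniform guess over the 10 buckets (10% accuracy); the slides do not state the bucket sizes, so a majority-class baseline could differ. The helper name `lift_over_random` is ours.

```python
def lift_over_random(accuracy, n_classes):
    """Ratio of a model's accuracy to uniform random guessing over n_classes."""
    random_accuracy = 1.0 / n_classes  # assumption: all buckets equally likely
    return accuracy / random_accuracy


# Accuracies taken from the slide; 10 buckets.
h2o_lift = lift_over_random(0.47, 10)    # roughly 4.7x
spark_lift = lift_over_random(0.42, 10)  # roughly 4.2x
```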
  9. 9. Predicting Why Users Call: 10 Specialized Algorithms Predicting 10 Buckets
     Very easy to do in Sparkling Water: map the enum to n binary buckets (here, 10 binary buckets).
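The enum-to-binary mapping from slide 9 is a one-vs-rest encoding: each bucket gets its own 0/1 target column. The sketch below shows the idea in plain Python rather than through the Sparkling Water API, which the slides do not show; the function name and the sample call reasons are hypothetical.

```python
def to_binary_buckets(labels, buckets):
    """Map one enum column to len(buckets) binary columns (one-vs-rest)."""
    return {
        bucket: [1 if label == bucket else 0 for label in labels]
        for bucket in buckets
    }


# Hypothetical call reasons; the talk uses 10 buckets.
call_reasons = ["billing", "technical", "billing", "activations"]
buckets = ["activations", "billing", "technical"]
binary_targets = to_binary_buckets(call_reasons, buckets)
```

Each binary column can then serve as the target of its own specialized model.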
  10. 10. Predicting Why Users Call: 10 Specialized Algorithms Predicting 10 Buckets
     Accuracy per bucket (Gradient Boosting Machine, temporal model):
     Bucket                | Spark ML | H2O  | H2O’s gain
     Bucket 0: activations | 97%      | 99%  | 2%
     Bucket 1: appointment | 97%      | 99%  | 2%
     Bucket 2: billing     | 84%      | 86%  | 2%
     Bucket 3: op-3        | 90%      | 93%  | 3%
     Bucket 4: op-4        | 85%      | 90%  | 5%
     Bucket 5: op-5        | 99%      | 99%  | 0%
     Bucket 6: op-6        | 98%      | 100% | 2%
     Bucket 7: op-7        | 80%      | 82%  | 2%
     Bucket 8: op-8        | 93%      | 97%  | 4%
     Bucket 9: technical   | 66%      | 87%  | 21%
     Average Accuracy      | 89%      | 95%  | 6%
  11. 11. Predicting Why Users Call: Looks good but…
     The 10 specialized algorithms (Gradient Boosting Machine, temporal model) average 89% accuracy with Spark ML and 95% with H2O, but the combined 10-bucket prediction does much worse:
     Metric          | Spark ML             | H2O
     Accuracy        | ?                    | 60%
     Processing time | 10 * 10 minutes      | 11 * 2 minutes
     Memory          | Limited size of test | No limit reached
     Ease of use     | Program dataFrame    | UI
     Why this drop from 95% to 60%?
  12. 12. Predicting Why Users Call: Learning 10 Specialized Algorithms in H2O
  13. 13. Overlapping Buckets
     The hope raised by the 95% composite precision of the 10 binary algorithms did not materialize: overlapping classes cause elements to be misclassified, as shown in the ROC (Receiver Operating Characteristic) charts drawn by H2O.
     [Figure: ROC charts, true positive rate vs. false positive rate]
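A toy example can show why a high average binary accuracy does not carry over to the combined prediction when buckets overlap. This is a sketch under assumptions: the scores are invented, each binary model is thresholded at 0.5, and the combined prediction takes the highest-scoring bucket. The slides do not say how the 10 models were actually combined.

```python
def binary_accuracies(labels, scores, buckets, threshold=0.5):
    """Accuracy of each one-vs-rest model, judged on its own yes/no question."""
    accuracies = {}
    for bucket in buckets:
        hits = sum(
            (row[bucket] >= threshold) == (label == bucket)
            for label, row in zip(labels, scores)
        )
        accuracies[bucket] = hits / len(labels)
    return accuracies


def combined_accuracy(labels, scores, buckets):
    """Accuracy when the final answer is the highest-scoring bucket (assumed rule)."""
    hits = sum(
        max(buckets, key=lambda b: row[b]) == label
        for label, row in zip(labels, scores)
    )
    return hits / len(labels)


# Invented scores: "billing" and "technical" overlap on two samples.
buckets = ["activations", "billing", "technical"]
labels = ["activations", "activations", "billing", "billing", "technical", "technical"]
scores = [
    {"activations": 0.9, "billing": 0.1, "technical": 0.2},
    {"activations": 0.8, "billing": 0.2, "technical": 0.1},
    {"activations": 0.1, "billing": 0.6, "technical": 0.7},  # both say "yes"
    {"activations": 0.1, "billing": 0.8, "technical": 0.3},
    {"activations": 0.2, "billing": 0.7, "technical": 0.6},  # both say "yes"
    {"activations": 0.1, "billing": 0.3, "technical": 0.9},
]

per_bucket = binary_accuracies(labels, scores, buckets)
mean_binary = sum(per_bucket.values()) / len(per_bucket)
combined = combined_accuracy(labels, scores, buckets)
```

In this toy data the mean binary accuracy is about 89% while the combined accuracy is about 67%: a single wrong yes/no answer costs one of 18 binary decisions but one of only 6 final predictions.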
  14. 14. Forecasting Improvements with H2O
     • Hypothesis case 1: B2: billing can be predicted with 100% accuracy (replace the estimation by the result)
     • The overall prediction model would jump to 75% accuracy
  15. 15. Forecasting Improvements
     • By fixing one of the problematic buckets, the overall prediction model would jump to 75% accuracy
     • By fixing both problematic buckets, the overall prediction model would jump to 86% accuracy
     These simple forecasts are worth gold, as they allow us to focus on the essential (out of thousands of parameters).
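The "replace estimation by result" forecast from slides 14 and 15 can be sketched as an oracle substitution: swap one bucket's estimated score for the ground truth and re-measure overall accuracy. Everything below is illustrative. The scores are invented, the highest-score combination rule is an assumption, and the resulting numbers are not the talk's 60%, 75%, and 86%.

```python
def combined_accuracy(labels, scores, buckets):
    """Accuracy when the final answer is the highest-scoring bucket (assumed rule)."""
    hits = sum(
        max(buckets, key=lambda b: row[b]) == label
        for label, row in zip(labels, scores)
    )
    return hits / len(labels)


def with_oracle(labels, scores, oracle_buckets):
    """Replace the estimated score of each oracle bucket by the true answer."""
    fixed = []
    for label, row in zip(labels, scores):
        new_row = dict(row)  # leave the original scores untouched
        for bucket in oracle_buckets:
            new_row[bucket] = 1.0 if label == bucket else 0.0
        fixed.append(new_row)
    return fixed


# Invented scores for three buckets, for illustration only.
buckets = ["activations", "billing", "technical"]
labels = ["activations", "billing", "billing", "technical", "technical", "activations"]
scores = [
    {"activations": 0.9, "billing": 0.1, "technical": 0.2},
    {"activations": 0.1, "billing": 0.6, "technical": 0.7},
    {"activations": 0.1, "billing": 0.8, "technical": 0.3},
    {"activations": 0.2, "billing": 0.7, "technical": 0.6},
    {"activations": 0.6, "billing": 0.1, "technical": 0.4},
    {"activations": 0.8, "billing": 0.2, "technical": 0.1},
]

baseline = combined_accuracy(labels, scores, buckets)
fix_billing = combined_accuracy(labels, with_oracle(labels, scores, ["billing"]), buckets)
fix_both = combined_accuracy(
    labels, with_oracle(labels, scores, ["billing", "technical"]), buckets
)
```

Comparing `baseline`, `fix_billing`, and `fix_both` shows how much each problematic bucket is worth before any effort goes into improving it.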
  16. 16. Conclusion
     The choice to switch to H2O was simple:
     • Superior results (accuracy)
     • Faster algorithms (factor 3)
     • Better use of memory
     • Accelerated studies because of
       – Input UI that lets us select/deselect columns
       – Very smart output UI (ROC, influential parameters…)
     • Stable and reliable algorithms
     Room for improvement:
     • The Sparkling Water interface showed some instabilities
     • We designed around it by generating CSV files