Practical machine learning models to prevent revenue loss — Eiti Kimura and Flávio Clésio (movile) @PAPIs Connect — São Paulo 2017

With today's high data volumes, there is a need for intelligent systems that can assist in data analysis and decision making. We offer a practical demonstration of using machine learning to build an intelligent application based on distributed system data. We'll show machine learning techniques applied to the development of a data analysis application that monitors distributed platforms with a direct impact on company revenue, saving more than 3M dollars a year. We will also provide the source code for a practical demonstration of how to train machine learning models and perform predictions with Apache Spark.

Published in: Technology

  1. Eiti Kimura / Flávio Clésio, @Movile. Brazil, June. PRACTICAL MACHINE LEARNING MODELS TO PREVENT REVENUE LOSS. PAPIS.io Connect 2017
  2. ABOUT US: Flávio Clésio (flavioclesio) • Core Machine Learning at Movile • MSc. in Production Engineering (Machine Learning in Credit Derivatives/NPL) • Specialist in Database Engineering and Business Intelligence • Blogger at Mineração de Dados (Data Mining) - http://mineracaodedados.wordpress.com • Strata Hadoop World Singapore Speaker
  3. ABOUT US: Eiti Kimura (eitikimura) • Software Architect and IT Coordinator at Movile • MSc. in Electrical Engineering • Apache Cassandra MVP (2014/2015 and 2015/2016) • Apache Cassandra Contributor (2015) • Cassandra Summit Speaker (2014 and 2015) • Strata Hadoop World Singapore Speaker
  4. WE MAKE LIFE BETTER THROUGH OUR APPS. Movile is the company behind several apps that make life easier.
  5. AGENDA ● The Movile Platform Case ● Practical Machine Learning Model Training ● Results and Conclusions
  6. MOVILE'S SUBSCRIPTION AND BILLING PLATFORM IN ITS SIMPLEST FORM ● A distributed platform ● User subscription management ● MISSION CRITICAL platform: cannot stop under any circumstances
  7. IMPACT OF REVENUE LEAKAGE LOSSES THROUGHOUT THE DAY ● 1-hour outage: tens of thousands of BRL ● 3-hour outage: hundreds of thousands of BRL ● 12-hour outage: millions of BRL
  8. MAIN PROBLEM: MONITORING. How can we check whether the platform is fully functional based on data analysis alone? Tip: what if we ask an intelligent system for help?
  9. WHAT DOES THE DATA LOOK LIKE? ● More than 230 million billing request attempts a day ● 4 main mobile carriers drive the operational work
  10. DATA AND ALGORITHM MODELING. SUPERVISED LEARNING: Linear Regression. Sample of data (predicting the number of successes). Label/target: # success = 61.083. Features: [carrier_weight, hour, week, response_time, #no_credit, #errors, #attempts] = [4.0, 17h, 3.0, 1259.0, 24.751.650, 2.193.67, 26.314.551]
  11. TESTED ALGORITHMS USING SPARK MLLIB ● Linear Model with Stochastic Gradient Descent (SGD) ● Lasso with SGD Model (L1 Regularization) ● Ridge Regression with SGD Model (L2 Regularization) ● Decision Tree with Regression Properties (Not CART)
  12. MODELING LIFECYCLE (diagram labels): Dataset, Training Data, Testing Data, Feature Extraction, Train, Score, Model, Evaluation
  13. SPARK NOTEBOOK: http://spark-notebook.io/
  14. TIME TO EVALUATE THE RESULTS. Machine learning tested models:
      Model                            Accuracy   RMSE
      Linear Model with SGD            44%        1.64
      Lasso with SGD Model             43%        1.65
      Ridge Regression with SGD Model  43%        1.66
      Decision Tree Model              96%        0.20
  15. WATCHER-AI INTRODUCTION: "Hi, I'm Watcher-ai! It is nice to see you here."
  16. WATCHER-AI ARCHITECTURE
  17. Avoided losing more than US$ 3 million by preventing leakage. Saved more than 500 working hours. Recovery time dropped from 6 hours to 1 hour.
  18. IT IS JUST A WARM UP • Automatic refeed and training using collected data; analyse more data to predict possible errors with carriers • Notify more people and specific teams (more complex problems) • Refactor to be more generic, so other teams can add their own algorithms • Why not interact with Watcher to guide the analysis? • Don't limit your mind, there is a lot to keep improving...
  19. THANK YOU! @eitikimura eiti-kimura-movile eiti.kimura@movile.com @flavioclesio fclesio flavio.clesio@movile.com github.com/fclesio/watcher-ai-samples papis-2017-brazil
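
The slides compare models by RMSE and describe monitoring the platform by comparing what a model predicts with what actually happens. As a rough illustration only, the sketch below computes RMSE in plain Python and flags an hour whose observed success count falls well below the prediction. The function names, the 20% tolerance, and the sample numbers are all assumptions made for this example; they are not taken from the talk, from Watcher-ai, or from Spark MLlib.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def leakage_suspected(observed, predicted, tolerance=0.2):
    """Return True when observed successes fall more than `tolerance`
    (a fraction of the prediction) below the predicted value.

    The 0.2 default is an arbitrary illustrative threshold.
    """
    if predicted <= 0:
        return False  # nothing sensible to compare against
    return (predicted - observed) / predicted > tolerance


# Hypothetical hourly success counts, invented for this example.
actual = [100.0, 120.0, 90.0, 110.0]
predicted = [98.0, 125.0, 92.0, 108.0]

error = rmse(actual, predicted)
print(round(error, 3))

# An hour far below the prediction is flagged; a small shortfall is not.
print(leakage_suspected(observed=60.0, predicted=110.0))
print(leakage_suspected(observed=105.0, predicted=110.0))
```

A relative threshold is used here so the same check could apply to carriers with very different volumes; a real deployment would likely tune it per carrier and per hour of the day.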
