SDD2017 - 03 Abed Ajraou - putting data science in your business a first utility feedback

Published in: Data & Analytics

  1. 1. Putting Data Science in Your Business: a First Utility Feedback. Abed Ajraou, Director of Data & Insights & Lead Data Scientist @First Utility
  2. 2. First Utility: putting customers in control and saving them money. Cheaper tariffs, great service, more knowledge.
  3. 3. Driving the Success of DS Solutions: Skills, Roles and Responsibilities
  4. 4. What have we missed here…? Source: https://whatsthebigdata.com/2016/05/01/data-scientists-spend-most-of-their-time-cleaning-data/
  5. 5. Right Technology
  6. 6. Data: THE NEW POWER. Individual transaction-level data, industry data and internal data feed the Data & Insights Platform, delivering business value for our clients: data for products and operational processes, data for dashboarding and business decisions, data for predictive analytics. This allows us to deliver a better service to our customers, to optimise the business and give a better price to our customers, and to give more knowledge to our customers.
  7. 7. Industry Data, Individual Transaction-Level Data, Internal Data ● Better agility ● Data Lake and Data Warehousing in the same platform ● Enable data discovery ● Collect more data ● Analyse the data with high performance ● Next generation of data visualisation on top of Hadoop
  8. 8. Right Mind-set
  9. 9. Start with a business problem. Not considering the business outcome is actually the first reason for project failure!
  10. 10. Start with a business problem
  11. 11. Starting with the data and not with the question … ?
  12. 12. Right Methodology
  13. 13. Explore the data ● Exploratory Analysis by Visualizing the data
  14. 14. Feature engineering: the creative part, with a lot of trial and error. Andrew Fogg won the competition by categorising the colours of cars.
  15. 15. Models to predict ● ML is often used in DS. ● Currently, the trending ML library is xgboost, which most of the time gives better results than the traditional Random Forest and Neural Networks. ● Reasons for its success? More accurate, more efficient, easy to use, customisable and distributed. ● Less time needs to be spent on feature engineering, but some creativity is still needed.
  16. 16. Models to predict: gradient boosting
  17. 17. Models to predict ● ML is often used in DS. ● Currently, the trending ML library is xgboost, which most of the time gives better results than the traditional Random Forest and Neural Networks. ● Reasons for its success? More accurate, more efficient, easy to use, customisable and distributed. ● Less time needs to be spent on feature engineering, but some creativity is still needed.
  18. 18. Evaluation and validation ● Overfitting/underfitting is the biggest fear of a Data Scientist. ● Cross-validation is one way to detect overfitting and protect the model against it.
  19. 19. Feedback loop ● An ML algorithm is a living system: like any living thing, it needs care! ● Learning from its mistakes is the only way for it to progress and to fit a real AI model.
  20. 20. Bad Methodology. Main reasons: • No clear business case • Trying to create the most accurate model from the start • No agility • No code version control
  21. 21. Iterative delivery is key (Sprint 1, Sprint 2). Main takeaways: • Agility is required • Weekly delivery is highly recommended to avoid falling into the “tunnel effect”
  22. 22. Going forward: AML (Automated Machine Learning)
  23. 23. Automation in Machine Learning is starting. Gartner says “More Than 40 Percent of Data Science Tasks Will Be Automated by 2020”. Source: https://www.gartner.com/newsroom/id/3570917
  24. 24. Gain in efficiency ● In the old days of the BI world, we gained efficiency by using ETL tools rather than hand-written scripts. However, ML is still often associated with R/Python/Scala coding.
  25. 25. My favorite app: Dataiku, the Collaborative Data Science Platform. Dataiku Flow => enables AML
  26. 26. Data Science is nothing without a team
  27. 27. Data Science is a range of skills! It’s quite rare to find them all in a single person. Source: Dsradar.com
  28. 28. Thank you for your attention. Any questions? Keep in contact: @AAjraou
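
The evaluation point on slide 18 (overfitting, underfitting and cross-validation) can be illustrated with a minimal sketch. Nothing below comes from the talk itself: the synthetic data, the two toy models and all function names are assumptions chosen for illustration, using only the Python standard library rather than xgboost or any other real modelling library.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle indices 0..n_samples-1 and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_mean(xs, ys):
    """Underfitting-prone toy model: always predicts the training mean."""
    avg = mean(ys)
    return lambda x: avg


def fit_memorizer(xs, ys):
    """Overfitting-prone toy model: 1-nearest neighbour, memorises the training set."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


def mse(model, xs, ys):
    """Mean squared error of the model on the given points."""
    return mean([(model(x) - y) ** 2 for x, y in zip(xs, ys)])


def cross_validate(fit, xs, ys, k=5, seed=0):
    """Return (mean training error, mean validation error) over k folds."""
    train_errors, val_errors = [], []
    for held_out in k_fold_indices(len(xs), k, seed):
        held = set(held_out)
        train_idx = [i for i in range(len(xs)) if i not in held]
        train_x = [xs[i] for i in train_idx]
        train_y = [ys[i] for i in train_idx]
        val_x = [xs[i] for i in held_out]
        val_y = [ys[i] for i in held_out]
        model = fit(train_x, train_y)
        train_errors.append(mse(model, train_x, train_y))
        val_errors.append(mse(model, val_x, val_y))
    return mean(train_errors), mean(val_errors)


# Hypothetical synthetic data: y = 2x plus Gaussian noise.
rng = random.Random(42)
xs = [i / 10 for i in range(100)]
ys = [2 * x + rng.gauss(0, 1) for x in xs]

for name, fit in [("mean (underfits)", fit_mean), ("memorizer (overfits)", fit_memorizer)]:
    train_err, val_err = cross_validate(fit, xs, ys)
    print(f"{name}: train MSE = {train_err:.3f}, validation MSE = {val_err:.3f}")
```

A large gap between training and validation error (the memorizer fits its training points perfectly but does worse on held-out folds) is the overfitting signal; a high error on both (the mean predictor) suggests underfitting. The same loop shape applies if the toy models are swapped for a real learner.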
