idalab seminar #6 ‘Lean’ Training Data: An Incremental Approach to Supervised Machine Learning

Machine learning in a young start-up does not look like a Kaggle competition. Data science projects start with a roadmap that is more extensive than the dataset. In the absence of data, subject-matter knowledge makes heuristic solutions a tempting first step for all stakeholders involved. While rules-based algorithms are not the glamorous side of data science, they need not be a dead end and can form the basis for increasingly sophisticated labeled data. In this talk, we propose a path to iteratively bootstrap a supervised machine learning model out of heuristics and demonstrate its potential on the basis of N26 projects.
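As a rough illustration of the first step on that path, the sketch below labels a random sample with a heuristic rule and trains a small model to predict the heuristic's response. Everything in it is an assumption made for illustration: the keyword rule, the synthetic transaction texts, and the tiny naive Bayes classifier are hypothetical and are not taken from the talk or from N26.

```python
import math
import random
from collections import Counter, defaultdict

# Hypothetical rule: a transaction counts as "groceries" if it mentions a keyword.
GROCERY_KEYWORDS = {"supermarket", "grocery", "bakery"}


def heuristic_label(text):
    """Rules-based labeller standing in for subject-matter knowledge."""
    return int(any(word in GROCERY_KEYWORDS for word in text.lower().split()))


class NaiveBayes:
    """Tiny multinomial naive Bayes over whitespace tokens, add-one smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in zip(texts, labels):
            for tok in text.lower().split():
                self.word_counts[label][tok] += 1
                self.vocab.add(tok)
        return self

    def predict(self, text):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            score = math.log(count / total)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for tok in text.lower().split():
                score += math.log((self.word_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def make_sample(n, seed=0):
    """Synthetic stand-in for a random sample of unlabeled data."""
    rng = random.Random(seed)
    merchants = ["city supermarket", "corner bakery", "grocery hall",
                 "metro ticket", "cinema palace", "book store"]
    extras = ["berlin", "card", "online", "pos"]
    return [f"{rng.choice(merchants)} {rng.choice(extras)}" for _ in range(n)]


texts = make_sample(200)
labels = [heuristic_label(t) for t in texts]  # heuristic labels, not ground truth

train_x, test_x = texts[:150], texts[150:]
train_y, test_y = labels[:150], labels[150:]

model = NaiveBayes().fit(train_x, train_y)

# "Accuracy" at this stage means agreement with the heuristic on held-out data.
agreement = sum(model.predict(t) == y for t, y in zip(test_x, test_y)) / len(test_x)
print(f"agreement with heuristic: {agreement:.2f}")
```

High agreement here only shows that the model reproduces the rule; it says nothing yet about how well either one matches human judgment.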

Published in: Data & Analytics

  1. ‘Lean’ Training Data: An incremental approach to supervised machine learning
     Jeremiah Lewis and Edouard Malet
     idalab seminar #6 | December 1st 2017
     Agency for Data Science: Machine learning & AI | Mathematical modelling | Data strategy
  2. ‘Lean’ Training Data: An incremental approach to supervised machine learning
     N26 Data Science
     Edouard Malet & Jeremiah Lewis
  3. Building a bank the world loves to use
     European banking license | Disrupt an industry | The Smartphone is the bank of the future | Mobile focus | Simplicity is the best experience | One click access to financial products
  4. Outstanding mobile experience
  5. Building a bank the world loves to use
  6. Model Evolution
     Product Stage | Dataset Definition | Definition of Accuracy | Success criteria
     Beta | Heuristic rules applied to random data sample | Predict heuristic response | Model equivalent to best non-ML alternative
  7. LIME
  8. Model Evolution
     Product Stage | Dataset Definition | Definition of Accuracy | Success criteria
     Beta | Heuristic rules applied to random data sample | Predict heuristic response | Model equivalent to best non-ML alternative
     Launch | Optimize heuristics applied to random data sample | LIME Test | Model is reasonable
  9. Gold Standard Data
  10. ‘Lean’ Training Data
     (chart; axis labels: Time, Accuracy; other labels: Heuristic, ‘Lean’, Waterfall, Heuristic Data, Revised Heuristics (LIME), Gold Standard)
  11. Model Evolution
     Product Stage | Dataset Definition | Definition of Accuracy | Success criteria
     Beta | Heuristic rules applied to random data sample | Predict heuristic response | Model equivalent to best non-ML alternative
     Launch | Optimize heuristics applied to random data sample | LIME Test | Model is reasonable
     2.0 | Gold standard dataset | Predict human label | Model captures subject domain with human-level accuracy
  12. Thank you
     What questions do you have?
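
The ‘Model Evolution’ slides change what accuracy is measured against: the heuristic response at the Beta stage, a human label at stage 2.0. The sketch below shows how the same predictions can score differently under the two definitions. The labels are invented toy values, not results from the talk, and the Launch-stage LIME test is left out because the slides do not describe it as a label comparison.

```python
from typing import Sequence


def accuracy(predictions: Sequence[int], reference: Sequence[int]) -> float:
    """Fraction of predictions that match the reference labels."""
    if len(predictions) != len(reference):
        raise ValueError("predictions and reference must have the same length")
    return sum(p == r for p, r in zip(predictions, reference)) / len(reference)


# Invented toy labels for five items (assumption, for illustration only).
model_predictions = [1, 1, 0, 0, 1]
heuristic_labels = [1, 1, 0, 0, 0]  # Beta reference: what the rules say
human_labels = [1, 1, 0, 0, 1]      # 2.0 reference: gold standard annotation

beta_accuracy = accuracy(model_predictions, heuristic_labels)
v2_accuracy = accuracy(model_predictions, human_labels)

print(f"Beta (vs heuristic): {beta_accuracy:.2f}")
print(f"2.0 (vs human label): {v2_accuracy:.2f}")
```

In this toy case the model disagrees with the heuristic on one item where the human label sides with the model, so a score against the heuristic can understate the model once gold standard data exists.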