DutchMLSchool. Machine Learning End-to-End

Anatomy of an Application: Machine Learning End-to-End - Main Conference: Introduction to Machine Learning.
DutchMLSchool: 1st edition of the Machine Learning Summer School in The Netherlands.

Published in: Data & Analytics

  1. 1. 1st edition | July 8-11, 2019
  2. 2. BigML, Inc #DutchMLSchool
     Anatomy of an ML Application: Machine Learning End-to-End
     Poul Petersen, CIO, BigML
  3. 3. Examples of ML Applications
  4. 4. Real-world ML Applications
     • Should you sign that NDA?
     • Upload the NDA to the website
     • The service uses Machine Learning to decide if the terms are fair
     https://ndalynn.com/
  5. 5. Real-world ML Applications
     • Gathers over 500 features about companies: Crunchbase / Tweets / Patents / LinkedIn / etc.
     • Creates a label for success/failure:
       • IPO or acquisition = success
       • Bankruptcy or irrelevance = failure
     • Uses Machine Learning to build a model that predicts the success or failure of startups
     • And puts all of the information together into an investor dashboard
     https://preseries.com
  6. 6. ML Adoption
     “The gap for most companies isn’t that machine learning doesn’t work, but that they struggle to actually use it”
     • Why?
       • Too much focus on algorithms
       • Not enough focus on applying Machine Learning
  7. 7. Real-world ML Applications
     …collecting and analyzing “hundreds of thousands of data points,” with a plan to boost that to “millions,” creating a model that forecasts turbulence with a level of confidence heretofore unseen.
     https://thepointsguy.com/news/this-is-the-reason-you-arent-feeling-as-much-turbulence-on-delta-flights/
     Not Important: the algorithm!
  8. 8. Machine Learning Evolution
     [Chart: evolution of Machine Learning. Stages: Genesis, Custom built, Product, Service, Utility, Commodity. Axes: Ubiquity, Certainty. Scale labels: Unknown, Defined, Novel, Common. Timeline: 1950s, 1980 (1st Workshop on Machine Learning), 2000s, 2011, 2020, 2030. Users: Academics & Researchers, Scientists, Developers, Analysts, Everyone. Tools: Weka, Scikit; BigML, Azure ML, Amazon ML, Google Cloud ML.]
     • Machine Learning algorithms are fun to talk about: GPUs, NNs, etc.
     • But the algorithms are largely a commodity already
     • Difficulty is knowing how to apply ML
  9. 9. What is an ML Application
     Consider the ML definition: finding patterns in data that can be used to make inferences…
     AIRLINE  ORIGIN  DESTINATION  DEPARTURE DELAY  DISTANCE  ARRIVAL DELAY
     AS       ANC     SEA          -11              1448,0    -22
     AA       LAX     PBI          -8               2330,0    -9
     US       SFO     CLT          -2               2296,0    5
     AA       LAX     MIA          -5               2342,0    -9
     AS       SEA     ANC          -1               1448,0    -21
     DL       SFO     MSP          -5               1589      8
     NK       LAS     MSP          -6               1299      -17
     US       LAX     CLT          14               2125,0    -10
     AA       SFO     DFW          -11              1464,0    -13
     DL       LAS     ATL          3                1747,0    -15
     Predictive Models
  10. 10. What is an ML Application
      [Table: same flight data as slide 9]
      Predictive Models
      • Where does this data come from?
      • How do you know what data?
      • Is the data formatted correctly?
      • What do you do with these models?
      • How do you combine them?
      • Will it work?
  11. 11. Reality of an ML Application
      [Diagram labels: Seen; Unseen; Predictive App; Data Collection; Data Transformations; Feature Engineering; Evaluation & Retraining]
  12. 12. Where to Start?
      Step 1: “Let’s predict customer churn!”
      Step 2: - - - - - - - - ???
      Finish: “Here are the customers we predict will leave our service”
  13. 13. Where to Start?
      Step 1: “Let’s detect fraud!”
      Step 2: - - - - - - - - ???
      Finish: “Here are the transactions we should stop immediately.”
  14. 14. ML Application Guide: State the problem as an ML Task
      • Remember: ML finds patterns in data enabling predictions about future events
      • This means you need data
        • What data depends on what you want to predict
        • And the data you have or can collect
      • Data needs to have patterns related to what you want to predict
        • Not magic: still can’t predict random events, lotteries, etc.
      • Your problem statement needs to be specific
        • Not “Let’s predict churn”
        • But “Let’s predict churn by looking at the profile data of all previous customers of our service who have/have not churned”
      • This can be tricky…
  15. 15. Where to Start?
      Step 1: “Let’s predict the Oscars!”
      Step 2: - - - - - - - - ???
      Finish: “Here are the predicted winners”
      • Statement is not specific enough!!!
      • What data can we collect that predicts Oscar wins?
  16. 16. Predicting the Oscars: BigML Scoresheet
      2018
      • 6 out of 6 right!
      • 8 out of 8 actually, but probability of the predictions was “too low”
        • Adapted Screenplay
        • Original Screenplay
      2019
      • 4 out of 8 major awards correctly predicted
      • Probabilities were lower this year
      • This is still significantly better than guessing
      How is this possible? Isn't the winner random?
  17. 17. How an Oscar is Won
      [Diagram labels: 7,000+ members; voting intention?]
      Insight: winning awards is not a random event!
  18. 18. Let’s Predict Best Picture
      [Diagram labels: Win London Critics; Lose Writers Guild; Win Directors Guild; Win Golden; Win Bafta; Oscar ?Win?]
      • These events are *not* independent
      • Similar, but not identical, factors contribute to each win…
      • We can expect a higher probability for Shape of Water to win
  19. 19. The Features
      Data pulled from IMDB…
      MOVIES
      • year • movie • movie_id • certificate • duration • genre • rate • metascore • synopsis • votes • gross • release_date • user_reviews • critic_reviews • popularity • awards_wins • awards_nominations
      • release_date.year • release_date.month • release_date.day-of-month • release_date.day-of-week
      AWARDS
      • Oscar_Best_Picture_nominated • Oscar_Best_Director_nominated • Oscar_Best_Actor_nominated • Oscar_Best_Actress_nominated
      • Oscar_Best_Supporting_Actor_nominated • Oscar_Best_Supporting_Actress_nominated • Oscar_Best_AdaScreen_nominated • Oscar_Best_OriScreen_nominated
      • Oscar_nominated • Oscar_nominated_categories
      • Golden_Globes_won • Golden_Globes_won_categories • Golden_Globes_nominated • Golden_Globes_nominated_categories
      • BAFTA_won • BAFTA_won_categories • BAFTA_nominated • BAFTA_nominated_categories
      • Screen_Actors_Guild_won • Screen_Actors_Guild_won_categories • Screen_Actors_Guild_nominated • Screen_Actors_Guild_nominated_categories
      • Critics_Choice_won • Critics_Choice_won_categories • Critics_Choice_nominated • Critics_Choice_nominated_categories
      • Directors_Guild_won • Directors_Guild_won_categories • Directors_Guild_nominated • Directors_Guild_nominated_categories
      • Producers_Guild_won • Producers_Guild_won_categories • Producers_Guild_nominated • Producers_Guild_nominated_categories
      • Art_Directors_Guild_won • Art_Directors_Guild_won_categories • Art_Directors_Guild_nominated • Art_Directors_Guild_nominated_categories
      • Writers_Guild_won • Writers_Guild_won_categories • Writers_Guild_nominated • Writers_Guild_nominated_categories
      • Costume_Designers_Guild_won • Costume_Designers_Guild_won_categories • Costume_Designers_Guild_nominated • Costume_Designers_Guild_nominated_categories
      • Online_Film_Television_Association_won • Online_Film_Television_Association_won_categories • Online_Film_Television_Association_nominated • Online_Film_Television_Association_nominated_categories
      • Online_Film_Critics_Society_won • Online_Film_Critics_Society_won_categories • Online_Film_Critics_Society_nominated • Online_Film_Critics_Society_nominated_categories
      • People_Choice_won • People_Choice_won_categories • People_Choice_nominated • People_Choice_nominated_categories
      • London_Critics_Circle_Film_won • London_Critics_Circle_Film_won_categories • London_Critics_Circle_Film_nominated • London_Critics_Circle_Film_nominated_categories
      • American_Cinema_Editors_won • American_Cinema_Editors_won_categories • American_Cinema_Editors_nominated • American_Cinema_Editors_nominated_categories
      • Hollywood_Film_won • Hollywood_Film_won_categories • Hollywood_Film_nominated • Hollywood_Film_nominated_categories
      • Austin_Film_Critics_Association_won • Austin_Film_Critics_Association_won_categories • Austin_Film_Critics_Association_nominated • Austin_Film_Critics_Association_nominated_categories
      • Denver_Film_Critics_Society_won • Denver_Film_Critics_Society_won_categories • Denver_Film_Critics_Society_nominated • Denver_Film_Critics_Society_nominated_categories
      • Boston_Society_of_Film_Critics_won • Boston_Society_of_Film_Critics_won_categories • Boston_Society_of_Film_Critics_nominated • Boston_Society_of_Film_Critics_nominated_categories
      • New_York_Film_Critics_Circle_won
      OBJECTIVE
      • Oscar_Best_Picture_won • Oscar_Best_Director_won • Oscar_Best_Actor_won • Oscar_Best_Actress_won • Oscar_Best_Supporting_Actor_won • Oscar_Best_Supporting_Actress_won
      Engineered Features: Award items field, Nomination Counts, Awards Counts
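The engineered features named on this slide (an award items field plus nomination and award counts) could be derived roughly as follows. This is a minimal sketch under stated assumptions: the `|` separator inside the `*_categories` fields, the sample row, and the names `split_items`, `add_count_features`, `nomination_count` and `award_count` are hypothetical and are not taken from the published dataset.

```python
def split_items(value, sep="|"):
    """Split a delimited categories string into a list of category names."""
    return [item.strip() for item in value.split(sep) if item.strip()]


def add_count_features(row):
    """Return a copy of `row` with total nomination and award counts added."""
    out = dict(row)
    out["nomination_count"] = sum(
        len(split_items(value))
        for key, value in row.items()
        if key.endswith("_nominated_categories")
    )
    out["award_count"] = sum(
        len(split_items(value))
        for key, value in row.items()
        if key.endswith("_won_categories")
    )
    return out


# Hypothetical row: field names follow the slide, values are made up.
row = {
    "movie": "Example Movie",
    "BAFTA_won_categories": "Best Film|Best Director",
    "BAFTA_nominated_categories": "Best Film|Best Director|Best Editing",
    "Golden_Globes_won_categories": "",
    "Golden_Globes_nominated_categories": "Best Director",
}

features = add_count_features(row)
print(features["award_count"], features["nomination_count"])
```

Keeping the split and the aggregation separate means the same `split_items` helper can feed an items-type field while the counts become plain numeric inputs.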
  20. 20. Oscars Dataset
      DATASET is publicly available: https://bigml.com/user/academy_awards/gallery/dataset/5a94302592fb565ed400103b
  21. 21. Oscars Example: Tidbits and Lessons Learned…
      • When specifying the problem, be as specific as possible
        • Not: “Let’s predict the Oscars”
        • Instead: “Let’s Predict the Oscars by correlating a series of award wins with the final Oscar win.”
      • The statement of the problem will guide the data required
      • Be aware of the cost of collecting the data versus the ROI:
  22. 22. Ranking ML Applications: Thinking about an ML Application?
      [Chart: ROI (impact and cost) against FEASIBILITY (inc. data availability / dec. complexity), axes marked - and +. Quadrant labels: NO-BRAINERS, START HERE, NO-GO, POSTPONABLE, BRAINERS.]
  23. 23. Oscars Example: Tidbits and Lessons Learned…
      • When specifying the problem, be as specific as possible
        • Not: “Let’s predict the Oscars”
        • Instead: “Let’s Predict the Oscars by correlating a series of award wins with the final Oscar win.”
      • The statement of the problem will guide the data required
      • Be aware of the cost of collecting the data versus the ROI:
        • IMDB data is readily available
      • We’re done, right?
      • Nope. You can’t escape Feature Engineering
        • Items: BAFTA_won_categories = list of nominations
        • Aggregations: Nomination and Award counts
      • You can’t escape Feature Selection
        • Full user reviews costly to collect and not useful
      Wait: How were you confident in the predictions?
  24. 24. Evaluating the Model
      [Diagram: Original Dataset (2000-2016, 119 variables) split into Train Dataset (2000-2012, 119 variables) and Test Dataset (2013-2016, 119 variables)]
      • Ultimately, we want to use all the history to predict the winner for the current year
      • In order to evaluate success, we use a model built from 2000-2012 data to predict the winners for 2013-2016
      • Built a separate Deepnet for each award category
      • Evaluation obtained a ROC AUC over 0.98 across all award categories
      Great: The model seems OK, what next?
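The evaluation described on this slide combines a split by year with a ROC AUC measurement. The sketch below shows both in plain Python. It assumes each row carries a `year` field, and the labels and scores are made-up numbers for illustration, not results from the talk. `split_by_year` and `roc_auc` are hypothetical helper names, and the AUC is computed with the pairwise-ranking definition rather than with any particular platform's evaluation routine.

```python
def split_by_year(rows, last_train_year):
    """Train on rows up to and including last_train_year, test on later rows."""
    train = [r for r in rows if r["year"] <= last_train_year]
    test = [r for r in rows if r["year"] > last_train_year]
    return train, test


def roc_auc(labels, scores):
    """ROC AUC as the probability that a random positive outranks a random
    negative, counting ties as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are needed to compute ROC AUC")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# One placeholder row per year, 2000 through 2016.
rows = [{"year": year, "movie": f"movie_{year}"} for year in range(2000, 2017)]
train, test = split_by_year(rows, last_train_year=2012)

# Made-up ground truth (1 = won) and model scores for five test nominees.
labels = [1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.4, 0.35, 0.1]
auc = roc_auc(labels, scores)
print(len(train), len(test), round(auc, 3))
```

Splitting by year instead of at random keeps later ceremonies out of the training data, which mirrors how the model would be used for the current year.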
  25. 25. Effort of an ML Application
      [Chart: Task against Effort. Tasks: State the problem as an ML task; Data wrangling; Feature engineering; Data transformations; Modeling and Evaluations; Predictions; Measure Results. Effort labels: ~80% effort, ~5% effort, ~5% effort, ~10% effort. Callouts: “This is only such low effort because of platforms like [logo]”; “This is an area where [logo] is currently innovating”.]
  26. 26. Reality Check: Three Important Concepts in Applying ML…
      • All Machine Learned models are wrong
      • Real-world Machine Learning is iterative
      • End-to-end Machine Learning is compositional
  27. 27. End-to-end ML is Compositional
      • Real-world problems
      • Solved by applying a combination of algorithms
      • Very rarely is it one-and-done
  28. 28. Basic Workflow
      SOURCE → DATASET → MODEL → PREDICTION
  29. 29. Feature Engineering
      [Workflow diagram labels: SOLD HOMES, FILTER, NEW FEATURES, DATASET, MODEL; FOR SALE HOMES, FILTER, NEW FEATURES; BATCH PREDICTION; DEALS DATASET]
  30. 30. End-to-end ML is Compositional
      • Real-world problems
      • Solved by applying a combination of algorithms
      • Very rarely is it one-and-done
      • Each “step” is often multi-stage as well
        • Filtering/Cleaning data
  31. 31. Anomaly Filter and Evaluate
      [Workflow diagram labels: DIABETES SOURCE, DIABETES DATASET, TRAIN SET, TEST SET, ANOMALY DETECTOR, FILTER, CLEAN DATASET, ALL MODEL, ALL MODEL, ALL EVALUATION, CLEAN EVALUATION, COMPARE EVALUATIONS]
  32. 32. Fixing Missing Values
      Fix Missing Values in a “Meaningful” Way
      [Workflow diagram labels: Original Dataset, Filter Zeros, Clean Dataset, Model insulin, Predict insulin, Amended Dataset, Select insulin, Fixed Dataset]
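One way to read this workflow: drop the rows where insulin is zero, fit a model for insulin on the remaining rows, predict insulin for the zero rows, and keep the prediction only where the original value was missing. The sketch below follows that reading. The slide does not say which model is used, so a one-variable least-squares line on `glucose` stands in for it; the data values and the helper name `fit_line` are made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Original dataset: an insulin value of 0 stands for "missing".
original = [
    {"glucose": 90, "insulin": 80},
    {"glucose": 120, "insulin": 140},
    {"glucose": 150, "insulin": 200},
    {"glucose": 110, "insulin": 0},
]

# Filter zeros to get the clean dataset used for training.
clean = [r for r in original if r["insulin"] != 0]

# Model insulin from glucose on the clean rows only.
slope, intercept = fit_line(
    [r["glucose"] for r in clean], [r["insulin"] for r in clean]
)

# Predict insulin for every row, then select the prediction only where
# the original value was missing.
fixed = []
for r in original:
    predicted = round(slope * r["glucose"] + intercept, 1)
    insulin = r["insulin"] if r["insulin"] != 0 else predicted
    fixed.append({**r, "insulin": insulin})

print(fixed[3])
```

Training only on the clean rows avoids teaching the model that zero is a plausible insulin value.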
  33. 33. End-to-end ML is Compositional
      • Real-world problems
      • Solved by applying a combination of algorithms
      • Very rarely is it one-and-done
      • Each “step” is often multi-stage as well
        • Filtering/Cleaning data
        • Tuning a model for optimum performance
  34. 34. Ensemble Tuning
      [Workflow diagram labels: SOURCE, DATASET, TRAINING, TEST, ENSEMBLE N=10, ENSEMBLE N=20, ENSEMBLE N=1000, EVALUATION, EVALUATION, EVALUATION, CHOOSE]
  35. 35. End-to-end ML is Compositional
      • Real-world problems
      • Solved by applying a combination of algorithms
      • Very rarely is it one-and-done
      • Each “step” is often multi-stage as well
        • Filtering/Cleaning data
        • Tuning a model for optimum performance
        • Finding the best features
  36. 36. Best-first Feature Selection
      [Diagram: candidates {F1}, {F2}, {F3}, {F4}, {Fn}: CHOOSE BEST, S = {Fa}. Candidates S+{F1}, S+{F2}, S+{F3}, S+{F4}, S+{Fn-1}: CHOOSE BEST, S = {Fa, Fb}. Candidates S+{F1}, S+{F2}, S+{F3}, S+{F4}, S+{Fn-1}: CHOOSE BEST, S = {Fa, Fb, Fc}.]
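The diagram describes a greedy loop: score every candidate set S+{F}, keep the best one, and repeat. A minimal sketch of that loop follows. The stopping rule (stop when no candidate improves the score) and the toy scoring function are assumptions, since the slide shows neither; in practice `score` would train and evaluate a model on the given feature subset. `best_first_selection`, `toy_score` and `USEFULNESS` are hypothetical names.

```python
def best_first_selection(features, score, max_features=None):
    """Greedy forward selection: grow S one feature at a time, always adding
    the feature whose inclusion gives the highest score."""
    selected = []
    best_score = float("-inf")
    remaining = list(features)
    while remaining and (max_features is None or len(selected) < max_features):
        # Score S + {F} for every remaining feature F and keep the best.
        candidate, candidate_score = max(
            ((f, score(selected + [f])) for f in remaining),
            key=lambda pair: pair[1],
        )
        if candidate_score <= best_score:
            break  # assumed stopping rule: no candidate improves on S
        selected.append(candidate)
        remaining.remove(candidate)
        best_score = candidate_score
    return selected, best_score


# Toy stand-in for "train a model on this subset and evaluate it":
# each feature adds some usefulness, and every feature carries a small cost.
USEFULNESS = {
    "BAFTA_won": 0.30,
    "Directors_Guild_won": 0.25,
    "duration": 0.01,
    "votes": 0.03,
}


def toy_score(subset):
    return sum(USEFULNESS[f] for f in subset) - 0.015 * len(subset)


selected, final_score = best_first_selection(list(USEFULNESS), toy_score)
print(selected)
```

Each round costs one evaluation per remaining feature, which is why the score function is the expensive part when it involves real model training.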
  37. 37. End-to-end ML is Compositional
      • Real-world problems
      • Solved by applying a combination of algorithms
      • Very rarely is it one-and-done
      • Each “step” is often multi-stage as well
        • Filtering/Cleaning data
        • Tuning a model for optimum performance
        • Finding the best features
      • May require models for several domains of knowledge
        • Multiple Training / Scoring
  38. 38. Multiple Domains
      [Workflow diagram labels: TRANSACTIONS; AGGREGATED BY CARD, AGGREGATED BY USER, AGGREGATED BY PROFILE; ANOMALY BY CARD, ANOMALY BY USER, ANOMALY BY PROFILE; NEW TRANSACTION; ANOMALY SCORE, ANOMALY SCORE, ANOMALY SCORE; APPROVED?]
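The diagram scores a new transaction against three separately trained anomaly detectors (by card, by user, by profile) before deciding whether to approve it. The sketch below illustrates that shape only. The detector is a simple z-score stand-in for a real anomaly detector, and the rule for combining the three scores (approve when the highest score is under a threshold) and the threshold itself are assumptions; the slide does not specify either. All amounts are made up.

```python
from statistics import mean, pstdev


def make_detector(history):
    """Build a scoring function from past amounts in one domain.
    Scores fall in [0, 1); higher means more unusual."""
    mu = mean(history)
    sigma = pstdev(history) or 1.0  # avoid dividing by zero on flat history

    def score(amount):
        z = abs(amount - mu) / sigma
        return z / (1.0 + z)

    return score


# One detector per domain, each trained on its own aggregation of history.
detectors = {
    "card": make_detector([20, 25, 30, 22, 28]),
    "user": make_detector([20, 25, 30, 200, 28]),
    "profile": make_detector([15, 40, 60, 35, 50]),
}


def approve(amount, detectors, threshold=0.8):
    """Score the amount in every domain and approve only if no domain
    considers it highly anomalous (assumed combination rule)."""
    scores = {name: detector(amount) for name, detector in detectors.items()}
    return max(scores.values()) < threshold, scores


ok_small, scores_small = approve(26, detectors)
ok_large, scores_large = approve(500, detectors)
print(ok_small, ok_large)
```

Taking the maximum makes any single domain able to block a transaction; averaging the scores would be a more lenient alternative.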
  39. 39. End-to-end ML is Compositional
      • Real-world problems
      • Solved by applying a combination of algorithms
      • Very rarely is it one-and-done
      • Each “step” is often multi-stage as well
        • Filtering/Cleaning data
        • Tuning a model for optimum performance
        • Finding the best features
      • May require models for several domains of knowledge
        • Multiple Training / Scoring
      • Even after deploying a model
        • Workflow to monitor performance, know when to retrain
  40. 40. Model Retraining
      [Workflow diagram labels: TRAINING, INPUT DATA, PREDICTIONS, ANOMALY SCORES, OUTCOMES, RETRAIN DATA]
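A monitoring workflow of this kind needs some rule for deciding when to retrain. The sketch below shows one possible rule built from the quantities on the diagram: compare predictions with observed outcomes, and watch the anomaly scores of incoming data for drift. The function name `should_retrain` and both thresholds are hypothetical choices, not values from the talk.

```python
def should_retrain(predictions, outcomes, anomaly_scores,
                   min_accuracy=0.8, max_mean_anomaly=0.6):
    """Flag retraining when recent accuracy drops below min_accuracy or when
    recent inputs look unusual compared with the training data."""
    if not outcomes or not anomaly_scores:
        raise ValueError("need at least one outcome and one anomaly score")
    if len(predictions) != len(outcomes):
        raise ValueError("predictions and outcomes must align")
    correct = sum(p == o for p, o in zip(predictions, outcomes))
    accuracy = correct / len(outcomes)
    mean_anomaly = sum(anomaly_scores) / len(anomaly_scores)
    return accuracy < min_accuracy or mean_anomaly > max_mean_anomaly


# Accuracy 3/4 = 0.75 is below the assumed 0.8 floor, so retraining is flagged.
degraded = should_retrain(
    predictions=["stay", "churn", "stay", "stay"],
    outcomes=["stay", "churn", "churn", "stay"],
    anomaly_scores=[0.1, 0.2, 0.1, 0.3],
)

# Predictions are still right, but the inputs have drifted.
drifted = should_retrain(["stay", "churn"], ["stay", "churn"], [0.9, 0.8])

healthy = should_retrain(["stay", "churn"], ["stay", "churn"], [0.1, 0.2])
print(degraded, drifted, healthy)
```

The anomaly check matters because outcomes often arrive late; drifting inputs can signal trouble before accuracy can be measured.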
  41. 41. Reality Check: Three Important Concepts in Applying ML…
      • All Machine Learned models are wrong
      • Real-world Machine Learning is iterative
      • End-to-end Machine Learning is compositional
  42. 42. Tenets of Machine Learning
      • Better features always beat better algorithms
      • Good algorithms already exist and are good enough
      • Tools like OptiML exist which can help optimize performance
      • The data is never good enough
      • All Machine Learned models are wrong, but some are useful
      • Real-world Machine Learning is iterative
      • End-to-end Machine Learning is compositional
      • Automation is better than hand tuning - you need an API!
      • When data changes quickly, training speed is more important than accuracy
      • Repeatability is superior to a single strong result
      • Problems are solved with workflows of algorithms
      • An ML solution is not real until it is in production
      • ML is here: Now we need 100,000x people applying ML
  43. 43. Co-organized by: Sponsor: Business Partners:
