
MLSEV. Machine Learning: Technical Perspective

What is the Big Deal? - by BigML.
MLSEV 2019: 1st edition of the Machine Learning School in Seville, Spain.

Published in: Data & Analytics

  1. 1. 1st edition March 7-8, 2019
  2. 2. BigML, Inc. ML: Technical Perspective. What is the Big Deal? Poul Petersen, CIO, BigML
  3. 3. BigML, Inc #MLSEV: ML a Technical Perspective. Sampling the Audience
       • Expert: Published papers at KDD, ICML, NIPS, etc. or developed own ML algorithms used at large scale
       • Aficionado: Understands pros/cons of different techniques and/or can tweak algorithms as needed
       • Practitioner: Very familiar with ML packages (Weka, Scikit, BigML, etc.)
       • Newbie: Just taking Coursera ML class or reading an introductory book to ML
       • Absolute beginner: ML sounds like science fiction
  4. 4. A Present for You
  5. 5. Free 1-Month Boosted Subscription: https://bigml.com/accounts/register/ MLSEV
  6. 6. What is Machine Learning?
  7. 7. What is Machine Learning? Let’s start with what is NOT Machine Learning…
       • Sentience
       • Killer robots
       • Generalized Artificial Intelligence
       • Anything to do with the word “singularity”
  8. 8. Oh the Hype! AlphaGo Zero beats a human at Go… are killer robots far off?
       • First of all, AlphaGo Zero is impressive!
       • But there is no need to fear killer robots powered by AlphaGo Zero:
         • Learning is not transferable: it must be retrained for chess, etc.
         • Works only for rule-based systems with a perfect simulator
         • Relies on games/systems with clear objectives (win/lose)
         • Cost $25 million [1]
       “While AlphaGo Zero is a step towards a general-purpose AI, it can only work on problems that can be perfectly simulated in a computer, making tasks such as driving a car out of the question. AIs that match humans at a huge range of tasks are still a long way off” - Demis Hassabis, CEO of DeepMind [2]
       [1] https://www.inc.com/lisa-calhoun/google-artificial-intelligence-alpha-go-zero-just-pressed-reset-on-how-we-learn.html
       [2] https://www.theguardian.com/science/2017/oct/18/its-able-to-create-knowledge-itself-google-unveils-ai-learns-all-on-its-own
  9. 9. Three Domains
       • Artificial Intelligence: Cool/Scary things… that mostly don’t exist
       • Machine Learning: AI concepts applied to very specific problems
       • Deep Learning: Specific techniques of Machine Learning
  10. 10. What is Machine Learning? Let’s start with what is NOT Machine Learning…
       • Sentience
       • Killer robots
       • Generalized Artificial Intelligence
       • Anything to do with the word “singularity”
       • Something “new”
         • First International Conference on ML held in 1980
         • Top-performing algorithms have been around for decades
       How do these things relate?
  11. 11. What is Machine Learning? A practical definition… Finding patterns in data that can be used to make inferences (Predictive Models)
       AIRLINE  ORIGIN  DESTINATION  DEPARTURE DELAY  DISTANCE  ARRIVAL DELAY
       AS       ANC     SEA          -11              1448      -22
       AA       LAX     PBI          -8               2330      -9
       US       SFO     CLT          -2               2296      5
       AA       LAX     MIA          -5               2342      -9
       AS       SEA     ANC          -1               1448      -21
       DL       SFO     MSP          -5               1589      8
       NK       LAS     MSP          -6               1299      -17
       US       LAX     CLT          14               2125      -10
       AA       SFO     DFW          -11              1464      -13
       DL       LAS     ATL          3                1747      -15
  12. 12. Machine Learning Terminology [Diagram labels: Data, Instances, Features, Label, ML algorithm, Training / Learning, Predictive model, New Instance, Predicting / Scoring, Prediction, Confidence]
  13. 13. Why Machine Learning?
  14. 14. Why Machine Learning [Figure: complexity of tasks (- to +) plotted against time (20th century to 21st century)]
  15. 15. Traditional Programming: Lost Baggage Policy
       • Explicit rules defined by requirements and experience
       • How do we program when the rules are unknown or very difficult to determine?
  16. 16. Programming with ML. Training data: the flight table from slide 11 (airline, origin, destination, departure delay, distance, arrival delay). Want: Flight Delay Prediction. Flight Delay Model???? What else can ML do?
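To make "programming with data" concrete, here is a minimal sketch that learns a flight delay predictor from the rows of the slide's table instead of hand-written rules. The choice of a one-variable least-squares line (arrival delay from departure delay) and the helper names `fit_line` and `predict_arrival_delay` are assumptions for illustration; the slides do not say which model is used.

```python
# Rows from the slide's flight table: (departure_delay, distance, arrival_delay).
flights = [
    (-11, 1448, -22), (-8, 2330, -9), (-2, 2296, 5), (-5, 2342, -9),
    (-1, 1448, -21), (-5, 1589, 8), (-6, 1299, -17), (14, 2125, -10),
    (-11, 1464, -13), (3, 1747, -15),
]

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# "Training": the rule is derived from the data, not written by hand.
departure = [row[0] for row in flights]
arrival = [row[2] for row in flights]
intercept, slope = fit_line(departure, arrival)

def predict_arrival_delay(departure_delay):
    """Hypothetical predictor: arrival delay in minutes for a new flight."""
    return intercept + slope * departure_delay

print(predict_arrival_delay(10))
```

With only ten rows the fitted line is a toy; the point is that the "program" is whatever the data implies.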
  17. 17. Machine Learning Tasks
  18. 18. Machine Learning Tasks [Diagram labels: Cluster Analysis, Anomaly Detection, Association Discovery, Topic Modeling, Time Series, Unsupervised, Classification and Regression, Supervised]
  19. 19. Predictive Maintenance
       • Classification: Will this component fail?
       • Regression: How many days until this component fails?
       • Time series forecasting: How many components will fail in a week from now?
       • Cluster analysis: Which machines behave similarly?
       • Anomaly detection: Is this behavior normal?
       • Association discovery: What alerts are triggered together before a failure?
  20. 20. Personalized Music
       • Classification: Will this song be a hit?
       • Regression: How many users will play this song next month?
       • Time series forecasting: How many downloads will this song have in 3 months?
       • Cluster analysis: Which songs are similar?
       • Anomaly detection: Is this song being played more than normal?
       • Association discovery: What songs do people like to play together?
  21. 21. Airline Revenue Management
       • Classification: Will this flight be booked at 80% 14 days out?
       • Regression: How many passengers will book this flight 7 days out?
       • Time series forecasting: How many tickets will be cancelled this week?
       • Cluster analysis: Which flight booking patterns are similar?
       • Anomaly detection: Are these flight booking patterns normal?
       • Association discovery: What price changes help overbook sooner?
  22. 22. Network Security
       • Classification: Is this email part of a phishing attack?
       • Regression: How many logins after work per week?
       • Time series forecasting: What will be the number of false alarms next week?
       • Cluster analysis: Are these users behaving similarly?
       • Anomaly detection: Is this user behavior worth inspecting?
       • Association discovery: What alerts were triggered before this attack?
  23. 23. All ML Models are Wrong
  24. 24. All ML Models are WRONG. Same patient… different models (deepnet, ensemble, logistic regression, decision tree)… different predictions (TRUE / FALSE)! At least one model is wrong… which one? Insight: Need a way to measure model fitness
  25. 25. Evaluating Models [Figure labels: PATIENT DATA, TRAINING, TEST, ENSEMBLE, PREDICTION, CONFIDENCE %, EVALUATION %] Stay Tuned: You will see this in Evaluations
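The training/test idea on this slide can be sketched as follows: hold out part of the data, fit on the rest, and score only on the held-out part. Everything here is a hypothetical stand-in: the patient data (temperature in tenths of a degree, fever label), the 80/20 split, and the one-threshold "model" are assumptions, not the ensemble from the slide.

```python
import random

def train_test_split(instances, test_fraction=0.2, seed=0):
    """Shuffle a copy of the instances and hold out a fraction for testing."""
    shuffled = instances[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def accuracy(model, labeled):
    """Fraction of (features, label) pairs the model predicts correctly."""
    return sum(model(x) == y for x, y in labeled) / len(labeled)

# Hypothetical patient data: (temperature in tenths of a degree C, has_fever).
patients = [(365 + i, 365 + i >= 380) for i in range(30)]
train, test = train_test_split(patients)

# "Training": put the threshold midway between the hottest negative and the
# coolest positive example seen in the training set only.
hottest_negative = max(t for t, fever in train if not fever)
coolest_positive = min(t for t, fever in train if fever)
threshold = (hottest_negative + coolest_positive) / 2

def model(temperature):
    return temperature >= threshold

# Evaluation uses only instances the model never saw during training.
print(f"test accuracy: {accuracy(model, test):.2f}")
```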
  26. 26. Measuring ML Mistakes
                        ACTUAL TRUE      ACTUAL FALSE
       MODEL TRUE       TRUE POSITIVE    FALSE POSITIVE
       MODEL FALSE      FALSE NEGATIVE   TRUE NEGATIVE
       We can bend the rules a bit…
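The four cells of the matrix on this slide can be counted directly from paired actual and predicted labels. This is a small sketch; the function name `confusion_counts` and the six example labels are made up for illustration.

```python
def confusion_counts(actual, predicted):
    """Count true/false positives/negatives for boolean labels."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for a, p in zip(actual, predicted):
        if p and a:
            counts["TP"] += 1   # model TRUE, actual TRUE
        elif p and not a:
            counts["FP"] += 1   # model TRUE, actual FALSE
        elif a:
            counts["FN"] += 1   # model FALSE, actual TRUE
        else:
            counts["TN"] += 1   # model FALSE, actual FALSE
    return counts

# Hypothetical labels for six instances.
actual    = [True, True, False, False, True, False]
predicted = [True, False, False, True, True, False]

counts = confusion_counts(actual, predicted)
print(counts)  # {'TP': 2, 'FP': 1, 'FN': 1, 'TN': 2}
```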
  27. 27. Operating Point [Figure: a TRUE/FALSE scale running from 0% to 100% with a movable operating point; moving it one way gives more false positives, the other way more false negatives] Why would you do this?
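One way to picture the operating point is as a threshold on the model's score: lowering it trades false negatives for false positives, and raising it does the opposite. The scores and labels below are hypothetical, and treating the operating point as a probability threshold is an assumption for this sketch.

```python
# Hypothetical (score, actual_label) pairs; score = model's probability of TRUE.
scored = [(0.95, True), (0.80, True), (0.70, False), (0.60, True),
          (0.40, False), (0.30, True), (0.20, False), (0.05, False)]

def errors_at(threshold):
    """Return (false_positives, false_negatives) when predicting TRUE at or above threshold."""
    false_pos = sum(1 for s, y in scored if s >= threshold and not y)
    false_neg = sum(1 for s, y in scored if s < threshold and y)
    return false_pos, false_neg

for threshold in (0.25, 0.50, 0.75):
    fp, fn = errors_at(threshold)
    print(f"threshold={threshold:.2f}  false positives={fp}  false negatives={fn}")
# threshold=0.25  false positives=2  false negatives=0
# threshold=0.50  false positives=1  false negatives=1
# threshold=0.75  false positives=0  false negatives=2
```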
  28. 28. Comparing Models [Figure: plot of % true positives against % false positives, with regions labeled IDEAL MODEL, BETTER, GOOD, RANDOM, WORST(?) MODEL, and TRIVIAL MODEL (twice)]
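A plot of true positive rate against false positive rate can be built by sweeping the threshold over every score, and the area under it gives one number for comparing models (1.0 for an ideal model, about 0.5 for random guessing). The sketch below uses hypothetical scores and assumes no two instances share a score; `roc_points` and `area_under_curve` are illustrative names.

```python
# Hypothetical (score, actual_label) pairs with distinct scores.
scored = [(0.95, True), (0.80, True), (0.70, False), (0.60, True),
          (0.40, False), (0.30, True), (0.20, False), (0.05, False)]

def roc_points(scored):
    """(false positive rate, true positive rate) as the threshold is lowered."""
    positives = sum(1 for _, y in scored if y)
    negatives = len(scored) - positives
    points = [(0.0, 0.0)]
    tp = fp = 0
    for _, label in sorted(scored, key=lambda pair: -pair[0]):
        if label:
            tp += 1
        else:
            fp += 1
        points.append((fp / negatives, tp / positives))
    return points

def area_under_curve(points):
    """Trapezoidal area under consecutive (x, y) points."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))

print(area_under_curve(roc_points(scored)))  # 0.8125
```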
  29. 29. Mistakes can be Costly [Images labeled FUN! and DANGER!]
  30. 30. Cost Functions [Figure: GOOD and BETTER? models on a plot of % true positives against % false positives, annotated with FALSE NEGATIVE COST and FALSE POSITIVE COST; caption: One possibility]
       • What is the cost of predicting cancer incorrectly?
       • What is the cost of labeling a fraudulent transaction as valid?
       • What is the cost of incorrectly predicting an aircraft part is safe?
       • Why can’t I just have a perfect model?
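When the two kinds of mistake have different costs, the operating point can be chosen to minimize total cost rather than the raw error count. In this sketch the scores, the candidate thresholds, and both cost values are hypothetical; a false negative is assumed to be ten times as expensive as a false positive.

```python
# Hypothetical (score, actual_label) pairs; score = model's probability of TRUE.
scored = [(0.95, True), (0.80, True), (0.70, False), (0.60, True),
          (0.40, False), (0.30, True), (0.20, False), (0.05, False)]

FALSE_NEGATIVE_COST = 10  # assumed: missing a real case is expensive
FALSE_POSITIVE_COST = 1   # assumed: a false alarm is cheap

def total_cost(threshold):
    """Cost of all mistakes when predicting TRUE at or above the threshold."""
    false_pos = sum(1 for s, y in scored if s >= threshold and not y)
    false_neg = sum(1 for s, y in scored if s < threshold and y)
    return false_pos * FALSE_POSITIVE_COST + false_neg * FALSE_NEGATIVE_COST

candidates = (0.25, 0.50, 0.75)
best_threshold = min(candidates, key=total_cost)

for threshold in candidates:
    print(f"threshold={threshold:.2f}  cost={total_cost(threshold)}")
print("cheapest operating point:", best_threshold)  # 0.25
```

With these costs, the threshold that produces the most false positives is still the cheapest one, because it avoids every false negative.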
  31. 31. How it Goes All Wrong
       • Over-fitting
       • Under-fitting
  32. 32. Hunting Dog Image Classifier [Images labeled TRUE and FALSE] Which images are pictures of dogs that are bred to be hunters?
  33. 33. Over-fitting… “Hunting dogs are short-haired spotted puppies that lie out on the grass”
  34. 34. A perfect model! How about some new images… [Images labeled TRUE and FALSE]
  35. 35. Over-fitting [New images: Model: true, Reality: false; Model: false, Reality: true]
       • This is an example of poor generalization
       • The model “fit” the training data perfectly
       • But it does not generalize well to new instances
  36. 36. Under-fitting. Only use ear shape: “Dogs with drop or pendant ears are hunters”
  37. 37. An imperfect model… now we are making some mistakes on the training data. [Images labeled TRUE and FALSE]
  38. 38. Under-fitting [New images: Model: true, Reality: true; Model: false, Reality: false]
       • This is an example of good generalization
       • The model “under-fit” the training data
       • But it is generalizing to new instances better
  39. 39. Under-fitting [New images: Model: false, Reality: true; Model: false, Reality: true]
  40. 40. Learning Problems / Complexity
       • Under-fitting: low complexity model, not fitting the data very well
       • Over-fitting: high complexity model, fitting the data too well
       One way to mitigate this is with different types of models…
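The complexity trade-off can be seen on a toy problem by comparing accuracy on the training data with accuracy on new instances. This sketch swaps in k-nearest-neighbours as the complexity knob (k=1 memorizes the training data, a constant majority-class model barely fits it); the data, the three deliberately mislabeled points, and all names are assumptions, not the dog classifier from the slides.

```python
def knn_predict(train, x, k):
    """Majority vote among the k training points closest to x."""
    nearest = sorted(train, key=lambda pair: abs(pair[0] - x))[:k]
    votes = sum(1 for _, label in nearest if label)
    return votes * 2 > k

def accuracy(predict, data):
    return sum(predict(x) == y for x, y in data) / len(data)

# True rule: label is TRUE when x >= 10. Three training labels are wrong (noise).
NOISY_LABELS = {3: True, 15: False, 17: False}
train = [(x, NOISY_LABELS.get(x, x >= 10)) for x in range(20)]
test = [(x + 0.5, x + 0.5 >= 10) for x in range(20)]  # new, correctly labeled instances

majority_label = sum(1 for _, y in train if y) * 2 > len(train)

models = {
    "majority class (low complexity)": lambda x: majority_label,
    "5-NN (medium complexity)": lambda x: knn_predict(train, x, 5),
    "1-NN (high complexity)": lambda x: knn_predict(train, x, 1),
}
results = {name: (accuracy(m, train), accuracy(m, test)) for name, m in models.items()}
for name, (train_acc, test_acc) in results.items():
    print(f"{name}: train={train_acc:.2f} test={test_acc:.2f}")
# majority class (low complexity): train=0.55 test=0.50
# 5-NN (medium complexity): train=0.85 test=1.00
# 1-NN (high complexity): train=1.00 test=0.85
```

The 1-NN model is "perfect" on the training data because it memorizes the noisy labels, and it pays for that on new instances; the constant model is poor on both.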
  41. 41. Choosing the ML Algorithm [Figure: models placed on two axes. Axes: decreasing interpretability / better representation / longer training; increasing data size/complexity. Models: Single Tree Model, Logistic Regression, Decision Forest, Random Decision Forest, Boosted Trees, Deepnets. Project stages: Early Stage (Rapid Prototyping), Mid Stage (Proven Application), Late Stage (Critical Performance). Caption: Hard?]
  42. 42. Automating Machine Learning
  43. 43. Deepnet Structure [Figure: a network with inputs x1 to x4 (4 features), several hidden layers of nodes (h1, h2, h3, …), and outputs y1 to y3 (3 classes). Annotation: h1 = activation?(wx, x) ?]
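A forward pass through a network of the shape drawn on this slide (4 features in, 3 class probabilities out) can be sketched in a few lines. The single hidden layer of 5 nodes, the ReLU activation, the softmax output, the random untrained weights, and the example input are all assumptions chosen for illustration; the slide leaves the activation as an open question.

```python
import math
import random

def dense(weights, biases, inputs):
    """One layer: a weighted sum of the inputs plus a bias for every node."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def relu(values):
    """Assumed activation function for the hidden layer."""
    return [max(0.0, v) for v in values]

def softmax(values):
    """Turn raw output scores into probabilities that sum to 1."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

rng = random.Random(0)

def random_layer(n_out, n_in):
    """Untrained weights, just to show the shapes involved."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

w_hidden, b_hidden = random_layer(5, 4)   # 4 features -> 5 hidden nodes
w_out, b_out = random_layer(3, 5)         # 5 hidden nodes -> 3 classes

def forward(features):
    hidden = relu(dense(w_hidden, b_hidden, features))
    return softmax(dense(w_out, b_out, hidden))

probabilities = forward([5.1, 3.5, 1.4, 0.2])
print(probabilities)
```

Training would adjust the weights; the number of layers, nodes per layer, and activation are the structural choices a network search has to make.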
  44. 44. BigML Deepnet
       • The success of a Deepnet depends on getting the right network structure for the dataset
       • But there are too many parameters:
         • Nodes, layers, activation function, learning rate, etc…
         • And setting them takes significant expert knowledge
       • Solution: Metalearning (a good initial guess)
       • Solution: Network search (try a bunch)
  45. 45. Automating Machine Learning: http://www.clparker.org/ml_benchmark/
  46. 46. Automating Machine Learning
       • Each resource has several parameters that impact quality
         • Number of trees, missing splits, nodes, weight
       • Rather than trial and error, we can use ML to find ideal parameters
       • Why not make the model type (Decision Tree, Boosted Tree, etc.) a parameter as well?
       • Similar to Deepnet network search, but finds the optimum machine learning algorithm and parameters for your data automatically
       Key Insight: We can solve any parameter selection problem in a similar way.
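Treating the model type as just another parameter can be sketched as a search over (model type, parameter) pairs, each scored on held-out data. This is a plain exhaustive search over four hypothetical candidates on toy data; it is not how OptiML works internally, and every name and value here is an assumption.

```python
def knn_predict(train, x, k):
    """Majority vote among the k training points closest to x."""
    nearest = sorted(train, key=lambda pair: abs(pair[0] - x))[:k]
    return sum(1 for _, label in nearest if label) * 2 > k

def accuracy(predict, data):
    return sum(predict(x) == y for x, y in data) / len(data)

# Toy data: true rule is x >= 10; three training labels are deliberately wrong.
NOISY_LABELS = {3: True, 15: False, 17: False}
train = [(x, NOISY_LABELS.get(x, x >= 10)) for x in range(20)]
validation = [(x + 0.5, x + 0.5 >= 10) for x in range(20)]

def build(model_type, k):
    """Turn one point of the search space into a trained predictor."""
    if model_type == "majority":
        label = sum(1 for _, y in train if y) * 2 > len(train)
        return lambda x: label
    return lambda x: knn_predict(train, x, k)

# The search space mixes model types and their parameters.
search_space = [("majority", None), ("knn", 1), ("knn", 3), ("knn", 5)]
scores = {candidate: accuracy(build(*candidate), validation)
          for candidate in search_space}
best = max(search_space, key=scores.get)

for candidate in search_space:
    print(candidate, f"{scores[candidate]:.2f}")
print("best:", best)  # best: ('knn', 5)
```

A real search space is far too large to enumerate, which is why a smarter strategy than brute force is needed.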
  47. 47. BigML OptiML
  48. 48. Fusions. Key Insight: ML algorithms each have unique strengths and weaknesses
       • Single Tree: output changes abruptly with inputs near decision boundary
       • Tree + Deepnet: output changes smoothly with inputs near decision boundary
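The effect described here can be sketched by averaging the outputs of two models with different behavior near the decision boundary. The step function standing in for a single tree, the sigmoid standing in for a deepnet, the boundary at 5.0, and the plain average are all assumptions for illustration; the slide does not specify how the models are combined.

```python
import math

BOUNDARY = 5.0  # hypothetical decision boundary on a single numeric feature

def tree_probability(x):
    """Stand-in for a single tree: the output jumps from 0 to 1 at the boundary."""
    return 1.0 if x >= BOUNDARY else 0.0

def deepnet_probability(x):
    """Stand-in for a deepnet: the output rises smoothly through the boundary."""
    return 1.0 / (1.0 + math.exp(-(x - BOUNDARY)))

def fusion_probability(x):
    """Assumed combination rule: the plain average of the two models."""
    return (tree_probability(x) + deepnet_probability(x)) / 2

for x in (4.0, 4.9, 5.0, 5.1, 6.0):
    print(f"x={x}: tree={tree_probability(x):.2f} "
          f"deepnet={deepnet_probability(x):.2f} fusion={fusion_probability(x):.2f}")
```

In this toy version the fused output still steps at the boundary, but the step is half as large and the output keeps changing gradually on either side of it.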
  49. 49. Fusions. Model Skills: Some ML algorithms “generally” do better on some feature types: [Figure labels: Full, Numeric, Text]
       • RDF for sparse text vectors
       • LR/Deepnets for numeric features
       • Trees for categorical features
  50. 50. Summary
       • Machine Learning is a subset of “Artificial Intelligence”
         • Finds patterns in data that can be used to make inferences
         • Can be thought of as “programming with data”
         • Has been around for a long time (only recently practical)
         • Already being used to solve real-world problems
       • Caveat Emptor:
         • Machine Learning mistakes are expected
         • Care must be taken to address the cost of mistakes
       • Automating Machine Learning
         • Powerful application of ML to parameterizing ML
         • Models can be fused to address specific data complexities