Machine Learning, hype or hit?

Machine Learning 101, as presented at SAP TechEd Barcelona, Nov 2016

  1. ANP126 Machine Learning: Hype or Hit? Fred Verheul
  2. Agenda 1. Introduction: Hype or Hit?! 2. Machine Learning 1. Demo, SAP ICN 2. Skill set for aspiring ML experts 3. Take-aways
  3. Agenda 1. Introduction: Hype or Hit?! 2. Machine Learning 1. Demo, SAP ICN 2. Skill set for aspiring ML experts 3. Take-aways
  4. Machine Learning: "Field of study that gives computers the ability to learn without being explicitly programmed" (Arthur Samuel, 1959)
  5. What is Machine Learning? Traditional Programming: Data + Program → Computer → Output. Machine Learning: Data + Output → Computer → Program.
  6. Examples: Recommender systems
  7. Examples: Natural Language Processing (Siri, Google Translate)
  8. Examples, continued… Spam filtering, handwriting recognition
  9. ML in the news: IBM Watson
  10. ML in the news: DeepMind’s AlphaGo
  11. ML in the news: business example
  12. Vendor Platforms…
  13. Tricking a neural network… A cat! Surely also a cat?! More examples and explanation by Julia Evans (@b0rk)
  14. Machine Learning gone wrong
  15. Data Mining Fail (by Carina C. Zona)
  16. Prediction is hard…
  17. Agenda 1. Introduction: Hype or Hit?! 2. Machine Learning 1. Demo, SAP ICN 2. Skill set for aspiring ML experts 3. Take-aways
  18. CRISP-DM: data mining process (diagram callouts: "ML important", twice)
  19. Data: terminology: feature, target / label, instance
  20. Examples of ML tasks. Supervised learning: Regression → target is numeric; Classification → target is categorical. Unsupervised learning: Clustering, Dimensionality reduction
  21. Exploratory Data Analysis
  22. Data preparation • Data Cleaning • Missing Data • Feature Engineering • Normalization • Categorical data → Numerical features • Log-based features or target • Date/time-related features • Combine features, e.g. by +, -, x, /
  23. Modeling: so many algorithms…
  24. ML Algorithms: by Representation. Representation: collection of candidate models/programs, aka hypothesis space. Examples: Decision trees, Instance-based, Neural networks, Model ensembles
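  25. ML Algorithms: by Evaluation. Evaluation: quality measure for a model. Regression, example metric: Root Mean Squared Error, $\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}$. Binary classification: confusion matrix (true class vs predicted class). Example: medical test for a disease. True class P: 8 predicted Positive (true positives, TP), 2 predicted Negative (false negatives, FN). True class N: 19 predicted Positive (false positives, FP), 971 predicted Negative (true negatives, TN). Accuracy: (TP + TN) / (TP + FN + FP + TN) = (8 + 971) / 1000 = 97.9%. Better evaluation metrics: • Precision: TP / (TP + FP) = 8 / (8 + 19) • Recall: TP / (TP + FN) = 8 / (8 + 2)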
  26. ML Algorithms: by Optimization. Optimization: how the algorithm ‘learns’; depends on representation and evaluation. Examples: Greedy Search (an example of combinatorial optimization); Gradient Descent (or in general: Convex Optimization); Linear Programming (or in general: Constrained/Nonlinear Optimization)
  27. Algorithms by Optimization: Heuristics • Hill climbing • Simulated Annealing • Nelder-Mead Simplex Method • Artificial Bee Colony Optimization • Genetic Algorithms • Particle Swarm Optimization • Ant Colony Optimization
  28. Choice of ML algorithm, considerations • Size & dimensionality of training set • Computational efficiency • Model building, no. of parameters • Eager vs lazy learning • Online vs batch • Interpretability
  29. Evaluation: training vs test data (5-fold cross-validation)
  30. Training error vs test error
  31. Overfitting
  32. Distance metrics, illustrated for the points P and Q at (2,2) and (6,5). Euclidean distance (L2-norm, $\|\cdot\|_2$): $\|P - Q\|_2$ = length of the line through (2,2) and (6,5) = $\sqrt{4^2 + 3^2} = 5$. Manhattan distance (L1-norm, $\|\cdot\|_1$): $\|P - Q\|_1$ = length of the line y = 2 (between 2 and 6) + length of the vertical line x = 6 (between 2 and 5) = 4 + 3 = 7. Chebyshev distance (L∞-norm, $\|\cdot\|_\infty$): $\|P - Q\|_\infty$ = max(4, 3) = 4, the number of moves of a King on a chessboard ;-) Many more: Cosine distance, Edit distance (aka Levenshtein distance), …
  33. Agenda 1. Introduction: Hype or Hit?! 2. Machine Learning 1. Demo, SAP ICN 2. Skill set for aspiring ML experts 3. Take-aways
  34. Agenda 1. Introduction: Hype or Hit?! 2. Machine Learning 1. Demo, SAP ICN 2. Skill set for aspiring ML experts 3. Take-aways
  35. So you want to be a Data Scientist?
  36. CRISP-DM: data mining process
  37. Hacking skills • Programming languages: • Libraries (examples): • TensorFlow, Caffe, Theano, Keras • SciPy & scikit-learn • Spark MLlib (Scala/Java/Python)
  38. Math skills: Statistics. Source: http://xkcd.com/552/
  39. More math skills that may be needed… Calculus, Linear Algebra
  40. Data Science for Business • Focuses more on general principles than specific algorithms • Not math-heavy, does contain some math • O’Reilly link: http://shop.oreilly.com/product/0636920028918.do • Book website: http://data-science-for-biz.com/DSB/Home.html
  41. Agenda 1. Introduction: Hype or Hit?! 2. Machine Learning 1. Demo, SAP ICN 2. Skill set for aspiring ML experts 3. Take-aways
  42. What has NOT been covered • Deep learning / Neural Networks • Specifics of ML algorithms • Tools / Libraries / Code • SAP Products, like HANA / Predictive Analytics / Vora / … • Hardware • …
  43. Take-aways • Goal of ML: generalize from training data (not optimization!!) • Part of ‘Data Mining Process’, not a goal in and of itself • No magic! Just some clever algorithms… • Increasingly important non-technical aspects: • Ethics • Algorithmic transparency
  44. Thank You www.soapeople.com info@soapeople.com @SOAPEOPLE Fred Verheul Big Data Consultant +31 6 3919 2986 fred.verheul@soapeople.com
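
The data preparation steps listed on slide 22 (normalization, turning categorical data into numerical features, log-based features, date/time-related features) could be sketched roughly as follows. This is a minimal illustration in plain Python, not code from the talk: the helper names `min_max_normalize` and `one_hot` and the sample values are assumptions made up for the example.

```python
import math
from datetime import date

def min_max_normalize(values):
    """Rescale numeric values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def one_hot(values):
    """Turn a categorical column into 0/1 columns, one per category."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]

# Hypothetical sample data, invented for this sketch.
prices = [10.0, 100.0, 1000.0]
colors = ["red", "green", "red"]
days = [date(2016, 11, 8), date(2016, 11, 12)]

normalized = min_max_normalize(prices)       # normalization
log_prices = [math.log10(p) for p in prices] # log-based feature
encoded = one_hot(colors)                    # categorical -> numerical
weekdays = [d.weekday() for d in days]       # date feature (Monday = 0)

print(normalized, log_prices, encoded, weekdays)
```

Libraries such as scikit-learn, named on slide 37, offer ready-made versions of these transformations; the hand-written helpers here only show the idea.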
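
The evaluation metrics on slide 25 can be checked with a few lines of Python. The sketch below uses the counts implied by the slide's own figures (TP = 8, FN = 2, FP = 19, TN = 971); the function names `rmse` and `classification_metrics` are illustrative assumptions, not part of the original deck.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def classification_metrics(tp, fn, fp, tn):
    """Accuracy, precision and recall from confusion-matrix counts."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
    }

# Counts from the medical-test example on slide 25.
metrics = classification_metrics(tp=8, fn=2, fp=19, tn=971)
print(metrics)
```

The example shows why accuracy alone can mislead on imbalanced data: accuracy is 97.9%, while precision is only 8/27 (about 30%) and recall is 80%.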
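
Slide 26 names gradient descent as one way an algorithm 'learns'. A minimal sketch, assuming a one-parameter linear model y = w * x fitted by minimizing mean squared error, might look like this. The data, the learning rate and the step count are hypothetical choices for illustration only.

```python
def fit_slope(xs, ys, learning_rate=0.05, steps=50):
    """Fit y = w * x by gradient descent on the mean squared error."""
    w = 0.0
    n = len(xs)
    for _ in range(steps):
        # Derivative of mean((w*x - y)^2) with respect to w.
        gradient = sum(2 * x * (w * x - y) for x, y in zip(xs, ys)) / n
        # Step against the gradient to reduce the error.
        w -= learning_rate * gradient
    return w

# Hypothetical data generated from y = 2 * x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

w = fit_slope(xs, ys)
print(w)
```

With this data the fitted slope converges towards 2. A learning rate that is too large would make the updates diverge instead, which is one reason it usually needs tuning.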
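
The 5-fold cross-validation mentioned on slide 29 splits the data into five parts and uses each part once as test data while training on the other four. The following is a rough sketch of the index bookkeeping only, with a hypothetical helper `k_fold_indices`; in practice the data would usually be shuffled first, and a library routine would be the more common choice.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_indices, test_indices) for each of k folds."""
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

folds = list(k_fold_indices(10, k=5))
for train, test in folds:
    print("train:", train, "test:", test)
```

Every sample lands in exactly one test fold, so averaging the five test errors gives an estimate of how well a model generalizes to unseen data.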
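
The three distance metrics on slide 32 can be written down directly. This sketch uses the two points from that slide, (2,2) and (6,5); the function names are illustrative.

```python
import math

def euclidean(p, q):
    """L2-norm of p - q: straight-line distance."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def manhattan(p, q):
    """L1-norm of p - q: sum of the absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(p, q))

def chebyshev(p, q):
    """L-infinity norm of p - q: largest absolute coordinate difference."""
    return max(abs(a - b) for a, b in zip(p, q))

P, Q = (2, 2), (6, 5)
print(euclidean(P, Q), manhattan(P, Q), chebyshev(P, Q))  # 5.0 7 4
```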