
Machine learning

An introduction to Machine Learning and how we used it at FreshBooks to automatically categorize our customers' expenses. Presented at the November 2015 ExploreTech Toronto meetup by Alex Vermeulen & Tobi Ogunbiyi.

Published in: Data & Analytics

  1. 1. Machine Learning. Tobi Ogunbiyi & Alex Vermeulen
  2. 2. What is Machine Learning? A computer program is said to learn if its measured performance on a task improves with experience.
  3. 3. Machine Learning Applications • Google’s self-driving car • Optical Character Recognition (OCR) • Google Street View • Facebook
  4. 4. Supervised Learning: The machine is “trained” using examples for which we know the correct answer. • Labeled data • Used for classification or prediction
  5. 5. Supervised Learning Example • Features: shape, size, colour, and sound • Labels: “cow”, “pig”, “chicken”, “llama”
  6. 6. Unsupervised Learning: Tries to find patterns and groupings by analyzing the characteristics of the data. • Unlabeled data • Identifies patterns and groupings in the data
  7. 7. Unsupervised Learning • No labels, just looking to group similar animals • Features: shape, size, colour, and sound
  8. 8. FreshBooks: cloud accounting software designed for small, service-based businesses
  9. 9. Story Telling: 330 expenses per month x 7 seconds per expense = about 8 hours per year (330 x 7 x 12 = 27,720 seconds, roughly 7.7 hours)
  10. 10. How can Machine Learning Help? • Simple & Painless • Create a better user experience • Save customers time and let them do what they do best!
  11. 11. How did we get there?
  12. 12. Data Collection, Pre-Processing, Sampling, Build Model, Refinement, Evaluation, Prediction
  14. 14. Data Collection: categorized expenses from the last year
  16. 16. Pre Processing (raw expense descriptions): #5795# QTH Toronto ON #991# Toronto ON Roots #130 Etobicoke ON True North Climbing Toronto ON Tim Hortons Eddie Bauer Canada M9C
  18. 18. Pre Processing (cleaned, with the tokens #5795#, #991#, #130 and M9C removed): QTH Toronto ON Toronto ON Roots Etobicoke ON True North Climbing Toronto ON Tim Hortons Eddie Bauer Canada
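The cleaning step on the Pre Processing slides can be sketched in Python. This is only an illustration: the two patterns below are assumptions inferred from the before/after slides (store numbers such as `#5795#` or `#130`, and a three-character postal code prefix such as `M9C`), and `clean_description` is a hypothetical helper name. The talk does not state the actual rules used.

```python
import re

# Assumed patterns, inferred from the slides' before/after example only.
STORE_NUMBER = re.compile(r"#\d+#?")              # matches "#5795#", "#991#", "#130"
POSTAL_PREFIX = re.compile(r"\b[A-Z]\d[A-Z]\b")   # matches "M9C"


def clean_description(text: str) -> str:
    """Remove store numbers and postal-code prefixes, then collapse whitespace."""
    text = STORE_NUMBER.sub(" ", text)
    text = POSTAL_PREFIX.sub(" ", text)
    return " ".join(text.split())


if __name__ == "__main__":
    for raw in ["#5795# QTH Toronto ON", "Roots #130 Etobicoke ON", "Eddie Bauer Canada M9C"]:
        print(clean_description(raw))
```

Real bank and card descriptions are far messier than these samples, so patterns like these would likely need tuning against actual data.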
  19. 19. Vectorize: (i) Tim Hortons QTH Toronto ON; (ii) Eddie Bauer Canada Toronto ON
  20. 20. Vectorize, terms: “tim” “hortons” “qth” “toronto” “on” “eddie” “bauer” “canada”
  26. 26. Vectorize, vector for (i): 1 1 1 1 1 0 0 0
  30. 30. Vectorize, vector for (ii): 0 0 0 1 1 1 1 1
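The Vectorize slides describe a bag-of-words encoding: collect every distinct term, then mark which terms each description contains. A minimal pure-Python sketch follows, assuming lowercase whitespace tokenization (the slides do not say how tokens were split). `build_vocabulary` and `vectorize` are illustrative names, not functions from the talk.

```python
def build_vocabulary(docs):
    """Return distinct lowercase terms in order of first appearance."""
    vocab = []
    for doc in docs:
        for token in doc.lower().split():
            if token not in vocab:
                vocab.append(token)
    return vocab


def vectorize(doc, vocab):
    """Count how often each vocabulary term occurs in the document."""
    tokens = doc.lower().split()
    return [tokens.count(term) for term in vocab]


docs = ["Tim Hortons QTH Toronto ON", "Eddie Bauer Canada Toronto ON"]
vocab = build_vocabulary(docs)
vectors = [vectorize(doc, vocab) for doc in docs]

if __name__ == "__main__":
    print(vocab)
    for vector in vectors:
        print(vector)
```

With the two example descriptions this reproduces the vectors on the slides. scikit-learn, listed in the talk's resources, provides a `CountVectorizer` for the same job at scale.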
  31. 31. Transform
  33. 33. Transform: Term Frequency - Inverse Document Frequency (tf-idf) = term frequency x inverse document freq. A numerical statistic that reflects how important or descriptive a term is to a single document in a collection.
  35. 35. Transform: term freq. = occurrences of term in document; inverse document freq. = $\log \frac{\text{total documents}}{\text{docs containing the term}}$
  36. 36. Tf-Idf Example: (i) Tim Hortons QTH Toronto ON; (ii) Eddie Bauer Canada Toronto ON. Terms: “tim” “hortons” “qth” “toronto” “on” “eddie” “bauer” “canada”. Counts for (i): 1 1 1 1 1 0 0 0
  39. 39. Tf-Idf Example: tf(tim, d1) = occurrences of term = 1
  44. 44. Tf-Idf Example: idf(tim, D) = $\log \frac{\text{total docs}}{\text{docs cont. term}} = \log_{10} \frac{2}{1} = 0.301$ (base-10 logarithm; 2 documents in total, 1 of them contains “tim”)
  46. 46. Tf-Idf Example: tfidf(tim, d1) = 1 x 0.301 = 0.301
  47. 47. Tf-Idf Example, values for (i): .301 .301 .301 0 0 0 0 0 (“toronto” and “on” appear in both documents, so their idf is $\log_{10} \frac{2}{2} = 0$)
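The tf-idf arithmetic from the example slides can be checked with a short sketch. It uses the slides' own definitions (raw term count multiplied by a base-10 log of total documents over documents containing the term). The function names are illustrative, and the base-10 choice is inferred from the 0.301 result.

```python
import math


def tf(term, tokens):
    """Term frequency: occurrences of the term in one document."""
    return tokens.count(term)


def idf(term, corpus):
    """Inverse document frequency with a base-10 log, as in the slides.

    Assumes the term occurs in at least one document of the corpus.
    """
    containing = sum(1 for tokens in corpus if term in tokens)
    return math.log10(len(corpus) / containing)


def tfidf_vector(tokens, vocab, corpus):
    """Tf-idf value for every vocabulary term, rounded to 3 decimals."""
    return [round(tf(term, tokens) * idf(term, corpus), 3) for term in vocab]


corpus = [
    "tim hortons qth toronto on".split(),
    "eddie bauer canada toronto on".split(),
]
vocab = ["tim", "hortons", "qth", "toronto", "on", "eddie", "bauer", "canada"]
d1 = tfidf_vector(corpus[0], vocab, corpus)
d2 = tfidf_vector(corpus[1], vocab, corpus)

if __name__ == "__main__":
    print(d1)
    print(d2)
```

Library implementations often differ from this textbook form. scikit-learn's `TfidfVectorizer`, for instance, defaults to a smoothed natural-log idf and normalizes each vector, so its numbers will not match the slide values exactly.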
  51. 51. Sampling: 80% training data, 20% testing data
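An 80/20 split like the one on the Sampling slide can be sketched as follows. The shuffle, the fixed seed, and the `train_test_split` helper name are assumptions made for illustration. The talk does not say how the samples were drawn.

```python
import random


def train_test_split(items, test_fraction=0.2, seed=0):
    """Shuffle a copy of the items and reserve a fraction for testing."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


expenses = list(range(10))  # stand-in for labeled expense records
train, test = train_test_split(expenses)

if __name__ == "__main__":
    print(len(train), "training items,", len(test), "testing items")
```

Holding the test portion out of training is what makes the later accuracy numbers meaningful, since the model is scored on examples it has never seen.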
  53. 53. Classify • Multinomial Logistic Regression (supervised) • Used for unordered, categorical outputs with more than 2 possible categories • Series of linear sub-models, each of which outputs a real number
  54. 54. Linear regression [chart: egg width (2 to 4.5) and egg length (4 to 7), with duck eggs and chicken eggs as the two classes]
  57. 57. Logistic Function • Normalizes the output of the model to a score between 0 and 1
  58. 58. Logistic Function: $\text{logistic function} = \frac{1}{1 + e^{-x}}$
  59. 59. Logistic Function [plot: output from 0 to 1 for inputs from -5 to 5]
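The logistic function from these slides takes only a few lines of Python. The two-branch form below is a common way to avoid floating-point overflow for large negative inputs; it is an implementation choice for this sketch and is not something the talk describes.

```python
import math


def logistic(x: float) -> float:
    """Map any real number to a score between 0 and 1: 1 / (1 + e^-x)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    # Algebraically identical form that avoids computing exp of a large positive number.
    e = math.exp(x)
    return e / (1.0 + e)


if __name__ == "__main__":
    for x in (-5, -1, 0, 1, 5):
        print(x, round(logistic(x), 3))
```

An input of 0 maps to exactly 0.5, and inputs near -5 or 5 already sit close to 0 or 1, which matches the range shown in the plot.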
  62. 62. Validation • Tested against the 20% reserved testing data set • Cross validation against a completely separate set of expense data
  65. 65. Refinement • How do you know if your model is good enough? • What can you do to improve your model? • Adjust the amount of data • Clean irrelevant parts of the data • Tweak parameters of the algorithm
  66. 66. Experiment [chart: mean accuracy (%, 0% to 100%) by sample size (10K to 500K) for SVM, Multinomial Naive Bayes, Bernoulli Naive Bayes, Logistic Regression L1, Logistic Regression L2]
  69. 69. Prediction • Use the model to classify new, unlabeled data
  70. 70. Prediction, new expense: Tim Hortons 3335 QPS Toronto ON
  72. 72. Prediction, scores: “Advertising” 0.020 • “Car & Truck Expenses” 0.018 • “Meals & Entertainment” 0.710 • “Personal” 0.230 • “Rent or Lease” 0.003 • “Travel” 0.011 • “Utilities” 0.008
  74. 74. Prediction, threshold: 0.6 (only “Meals & Entertainment”, at 0.710, scores above it)
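The thresholding step on the last Prediction slide might look like the sketch below. The scores are the ones shown on the slide. `predict_category` is a hypothetical helper, and returning `None` when no score reaches the threshold is an assumption, since the talk does not say what happens in that case.

```python
def predict_category(scores, threshold=0.6):
    """Return the top-scoring category, or None if it falls below the threshold."""
    label, score = max(scores.items(), key=lambda item: item[1])
    return label if score >= threshold else None


# Scores from the slide for "Tim Hortons 3335 QPS Toronto ON".
scores = {
    "Advertising": 0.020,
    "Car & Truck Expenses": 0.018,
    "Meals & Entertainment": 0.710,
    "Personal": 0.230,
    "Rent or Lease": 0.003,
    "Travel": 0.011,
    "Utilities": 0.008,
}

if __name__ == "__main__":
    print(predict_category(scores))
```

A threshold trades coverage for precision: raising it means fewer expenses get an automatic category, but the ones that do are more likely to be right.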
  75. 75. Getting Started • What kind of data do you have? • Labeled or unlabeled? • What questions are you trying to answer? • Make predictions • Label or classify • Identify patterns or groupings
  76. 76. Lessons Learned • Machines are intelligent, but not magicians • It’s easy to know when you’re wrong, but harder to know when you’re right • Some people prefer to have control
  77. 77. Take away? • Opens up new opportunities • Potential to deliver amazing user experiences • Machine Learning is fun!
  78. 78. Thank You.
  79. 79. Resources • Interested in following along with FreshBooks Learnings? • medium.com/@freshbookspd • Want to learn more about Machine Learning? • udacity.com • coursera.org • Python’s scikit-learn
  80. 80. More examples of our experimentation…
  81. 81. Model Build Time [chart: time to train (min, 0 to 10) by sample size (10K to 500K) for SVM, Multinomial Naive Bayes, Bernoulli Naive Bayes, Logistic Regression L1, Logistic Regression L2]
  82. 82. Prediction Time [chart: time to predict (ms, 0ms to 80ms) by sample size (10K to 500K) for the same five models]
  83. 83. File Size [chart: file size (MB, 0MB to 2,500MB) by sample size (10K to 500K) for the same five models]
