One or the Other: An Overview of Binary Classification Methods

Presented at Toronto Data Science Group
February 11, 2016

Myles Harrison
myles@mylesharrison.com

Published in: Data & Analytics

  1. One or the Other: An Overview of Binary Classification Methods. Myles Harrison (@everydayanalyst), www.everydayanalytics.ca, myles@mylesharrison.com. Toronto Data Science Group, Thursday, February 11, 2016
  2. (a quick review)
  3. Binary Classification ELI5: rows [x11, x12, x13, ..., x1n], [x21, x22, x23, ..., x2n], [x31, x32, x33, ..., x3n], [x41, x42, x43, ..., x4n], ..., [xm1, xm2, xm3, ..., xmn]; new row [x1, x2, x3, x4, ..., xn] -> ?
  4. Cross-validation: K-fold where k=3 (folds 1, 2, 3)
  5. Decision Trees
  6. hair past shoulders?: Y N Y N N N Y N; glasses?: Y Y Y Y Y N N Y; Netflix binges?: Y Y Y Y Y Y Y Y
  7. Entropy = -p(man)*log2(p(man)) - p(woman)*log2(p(woman)) = -0.5*log2(0.5) - 0.5*log2(0.5) = -0.5(-1) - 0.5(-1) = 0.5 + 0.5 = 1
  8. Branch entropies (branches labelled Y and N in the figure): hair past shoulders?: 0 and 0.72; glasses?: 0 and 0.92; Netflix binges?: 1 and 0
  9. Information Gain: Entropy Before - Entropy After = entropy(parent) - [p(group1)*entropy(group1) + p(group2)*entropy(group2)]. For Netflix binges? (branch entropies 1 and 0): 1 - [1(1) + 0] = 0
  10. hair past shoulders?: entropies 0 and 0.72, IG = 0.55; glasses?: entropies 0 and 0.92, IG = 0.31; Netflix binges?: entropies 1 and 0, IG = 0
  11. hair past shoulders? glasses? Y N Y N Y N
  12. Bagging & Random Forests: [x11, x12, ..., x1n]
  13. 0.90 0.85 0.95 0.93 0.80 0.88
  14. Support Vector Machines (SVMs)
  15. ξi ξi Hinge loss:
  16. ϕ
  17. 0.85 0.93 0.97 0.97 0.47 0.90
  18. Nearest Neighbo(u)rs (kNN)
  19. ?
  20. Distance Metrics
  21. Euclidean Distance (figure labels: p1, p2, d, x, y, (x1, y1), (x2, y2))
  22. A, B: d = x + y; d = |x2-x1| + |y2-y1|; d = ||p1-p2||_1
  23. p1 p2 θ
  24. k = 3 k = 5
  25. 0.90 0.82 0.90 0.88 0.80 0.97
  26. ?
  27. http://www.everydayanalytics.ca
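
The entropy and information-gain arithmetic on slides 6 to 10 can be reproduced with a short sketch. The three feature columns are copied from slide 6. The transcript does not include the class labels, so the `labels` list below is an assumption: it is one assignment of four men ("M") and four women ("W") that reproduces the slide values (parent entropy 1; IG of 0.55, 0.31 and 0). The function names are illustrative and do not come from the talk.

```python
from math import log2


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0  # an empty branch contributes nothing
    total = 0.0
    for cls in set(labels):
        p = labels.count(cls) / n
        total -= p * log2(p)
    return total


def information_gain(feature, labels):
    """Entropy before the split minus the weighted entropy after it."""
    n = len(labels)
    after = 0.0
    for value in set(feature):
        group = [lab for f, lab in zip(feature, labels) if f == value]
        after += len(group) / n * entropy(group)
    return entropy(labels) - after


# Feature columns from slide 6.
hair = list("YNYNNNYN")
glasses = list("YYYYYNNY")
netflix = list("YYYYYYYY")

# ASSUMPTION: labels are not in the transcript; this assignment is
# reconstructed so that the results match slides 7 to 10.
labels = ["W", "M", "W", "M", "M", "W", "W", "M"]

print(entropy(labels))                             # 1.0
print(round(information_gain(hair, labels), 2))    # 0.55
print(round(information_gain(glasses, labels), 2)) # 0.31
print(round(information_gain(netflix, labels), 2)) # 0.0
```

A greedy tree builder would split on the feature with the largest gain first, which here is "hair past shoulders?", consistent with it appearing first on slide 11.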

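The distance metrics on slides 20 to 23 and the k = 3 versus k = 5 comparison on slide 24 can be sketched in the same way. The L1 distance on slide 22 is commonly called Manhattan distance. Reading slide 23 (an angle θ between p1 and p2) as cosine similarity is an assumption, as are the toy points, the labels "A" and "B", and all function names.

```python
from collections import Counter
from math import sqrt


def euclidean(p, q):
    """Straight-line distance: square root of the summed squared differences."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def manhattan(p, q):
    """L1 distance: sum of absolute coordinate differences (slide 22)."""
    return sum(abs(a - b) for a, b in zip(p, q))


def cosine_similarity(p, q):
    """cos(theta) between p and q (assumed reading of slide 23)."""
    dot = sum(a * b for a, b in zip(p, q))
    return dot / (sqrt(sum(a * a for a in p)) * sqrt(sum(b * b for b in q)))


def knn_predict(points, labels, query, k=3, distance=euclidean):
    """Majority vote among the k training points closest to the query."""
    ranked = sorted(zip(points, labels), key=lambda pair: distance(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two "A" points close to the origin, three "B"
# points slightly further out, and one distant "A" point.
points = [(1, 0), (0, 1), (2, 0), (0, 3), (3, 0), (10, 10)]
labels = ["A", "A", "B", "B", "B", "A"]

print(knn_predict(points, labels, (0, 0), k=3))  # A
print(knn_predict(points, labels, (0, 0), k=5))  # B
```

The same query is labelled differently at k = 3 and k = 5, which is why k is usually chosen with a procedure such as the k-fold cross-validation on slide 4 rather than fixed in advance.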