# One or the Other: An Overview of Binary Classification Methods

Presented at Toronto Data Science Group
February 11, 2016

Myles Harrison
myles@mylesharrison.com

Published in: Data & Analytics

### One or the Other: An Overview of Binary Classification Methods

1. One or the Other: An Overview of Binary Classification Methods. Toronto Data Science Group, Thursday, February 11, 2016. Myles Harrison, @everydayanalyst, www.everydayanalytics.ca, myles@mylesharrison.com
2. (a quick review)
3. Binary Classification ELI5: [x11, x12, x13, ..., x1n] [x21, x22, x23, ..., x2n] [x31, x32, x33, ..., x3n] [x41, x42, x43, ..., x4n] ... [xm1, xm2, xm3, ..., xmn]; [x1, x2, x3, x4, ..., xn] → ?
4. Cross-validation: K-fold where k = 3 (folds 1, 2, 3)
5. Decision Trees
6. hair past shoulders? Y N Y N N N Y N; glasses? Y Y Y Y Y N N Y; Netflix binges? Y Y Y Y Y Y Y Y
7. $\text{entropy} = -p(\text{man})\log_2 p(\text{man}) - p(\text{woman})\log_2 p(\text{woman}) = -0.5\log_2(0.5) - 0.5\log_2(0.5) = -0.5(-1) - 0.5(-1) = 0.5 + 0.5 = 1$
8. Y / N branches. Branch entropies after each split: hair past shoulders? 0 and 0.72; glasses? 0 and 0.92; Netflix binges? 1 and 0
9. Information Gain: Entropy Before − Entropy After $= \text{entropy}(\text{parent}) - [\,p(\text{group1})\,\text{entropy}(\text{group1}) + p(\text{group2})\,\text{entropy}(\text{group2})\,]$. For Netflix binges? (branch entropies 1 and 0): $1 - [\,1(1) + 0\,] = 0$
10. Y / N branches. Branch entropies: hair past shoulders? 0 and 0.72 (IG = 0.55); glasses? 0 and 0.92 (IG = 0.31); Netflix binges? 1 and 0 (IG = 0)
11. hair past shoulders? glasses? Y N Y N Y N
12. Bagging & Random Forests: [x11, x12, ..., x1n]
13. 0.90 0.85 0.95 0.93 0.80 0.88
14. Support Vector Machines (SVMs)
15. Hinge loss, with slack variables ξi
16. ϕ
17. 0.85 0.93 0.97 0.97 0.47 0.90
18. Nearest Neighbo(u)rs (kNN)
19. ?
20. Distance Metrics
21. Euclidean Distance: p1 (x1, y1), p2 (x2, y2), d, x, y
22. A, B: $d = x + y = |x_2 - x_1| + |y_2 - y_1| = \lVert p_1 - p_2 \rVert_1$
23. p1, p2, θ
24. k = 3, k = 5
25. 0.90 0.82 0.90 0.88 0.80 0.97
26. ?
27. http://www.everydayanalytics.ca
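
Slide 4 mentions k-fold cross-validation with k = 3. A minimal sketch of how the folds could be formed is shown below. The function name `k_fold_indices` and the choice of 9 samples are illustrative assumptions, not taken from the slides, and no shuffling is done here, although it often would be in practice.

```python
def k_fold_indices(n_samples, k=3):
    """Yield (train, test) index lists; each fold serves as the test set once."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


# Hypothetical example: 9 samples split into 3 folds.
folds = list(k_fold_indices(9, k=3))
for train, test in folds:
    print("train:", train, "test:", test)
```

Each sample appears in exactly one test fold, so a score averaged over the three folds uses every sample for evaluation once.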
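
Slides 7 to 10 work through entropy and information gain for the three candidate splits. The sketch below recomputes those figures. The per-branch (men, women) counts are an assumption: the slides give only the Y/N answers and the resulting entropies, so the counts here were chosen to be consistent with the reported values (3 people with hair past shoulders, 6 with glasses, 8 in total) and may not match the original example exactly.

```python
from math import log2


def entropy(counts):
    """Shannon entropy in bits of a group, given its class counts."""
    total = sum(counts)
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)


def information_gain(parent, groups):
    """Parent entropy minus the size-weighted entropy of the child groups."""
    total = sum(parent)
    after = sum(sum(g) / total * entropy(g) for g in groups if sum(g) > 0)
    return entropy(parent) - after


# Assumed (men, women) counts per branch; only the entropies appear on the slides.
parent = (4, 4)
hair = [(0, 3), (4, 1)]      # branch entropies 0 and about 0.72
glasses = [(4, 2), (0, 2)]   # branch entropies about 0.92 and 0
netflix = [(4, 4), (0, 0)]   # everyone answers Y, so nothing is separated

for name, groups in [("hair", hair), ("glasses", glasses), ("netflix", netflix)]:
    print(name, round(information_gain(parent, groups), 2))
```

Under these assumed counts the gains round to the values on slide 10, which is why hair length is chosen as the first split.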
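
Slide 15 names the hinge loss, but the formula itself did not survive in the transcript. The sketch below uses the standard definition, max(0, 1 − y·f(x)) with labels in {−1, +1}, on the assumption that this is what the slide showed. The function name and the sample scores are illustrative.

```python
def hinge_loss(y_true, score):
    """Standard hinge loss for a label in {-1, +1} and a real-valued score f(x)."""
    return max(0.0, 1.0 - y_true * score)


# Hypothetical scores from a linear classifier.
print(hinge_loss(+1, 2.0))   # correct side, outside the margin
print(hinge_loss(+1, 0.5))   # correct side, inside the margin
print(hinge_loss(-1, 0.5))   # wrong side of the boundary
```

In the usual soft-margin formulation, the slack variable ξi of a training point equals its hinge loss at the optimum, which is likely why the two appear together on the slide.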
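
Slides 20 to 23 cover distance metrics: Euclidean distance, the L1 distance d = |x2-x1| + |y2-y1| (commonly called Manhattan distance), and a diagram with an angle θ between p1 and p2. The sketch below implements the first two directly. Reading the θ slide as cosine similarity is an assumption, since the slide gives no formula.

```python
from math import sqrt


def euclidean(p, q):
    """Straight-line distance between two points."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def manhattan(p, q):
    """L1 distance: the sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(p, q))


def cosine_similarity(p, q):
    """Cosine of the angle between two non-zero vectors (assumed reading of slide 23)."""
    dot = sum(a * b for a, b in zip(p, q))
    return dot / (sqrt(sum(a * a for a in p)) * sqrt(sum(b * b for b in q)))


# Hypothetical points for illustration.
p1, p2 = (0, 0), (3, 4)
print(euclidean(p1, p2), manhattan(p1, p2))
print(cosine_similarity((1, 0), (0, 1)))
```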
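
Slide 24 contrasts k = 3 with k = 5 for nearest neighbours. A minimal sketch of the majority vote follows, using Euclidean distance via `math.dist` (Python 3.8+). The toy points, labels, and query are hypothetical, chosen so that the prediction changes between the two values of k.

```python
from collections import Counter
from math import dist


def knn_predict(points, labels, query, k=3):
    """Return the majority label among the k training points closest to the query."""
    nearest = sorted(zip(points, labels), key=lambda pl: dist(pl[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Hypothetical toy data: two "a" points near the query, "b" points further away.
points = [(1, 1), (1, 2), (3, 1), (3, 3), (1, 4), (5, 5)]
labels = ["a", "a", "b", "b", "b", "b"]
query = (1, 1.5)

print(knn_predict(points, labels, query, k=3))
print(knn_predict(points, labels, query, k=5))
```

With k = 3 the two nearby "a" points win the vote; with k = 5 three more distant "b" points outvote them, which illustrates how sensitive the result can be to the choice of k.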