Legal Analytics Course - Class 7 - Binary Classification with Decision Tree Learning - Professor Daniel Martin Katz + Professor Michael J Bommarito

Published in: Law

  1. Class 7: Binary Classification & Decision Tree Learning. Legal Analytics. Professor Daniel Martin Katz, Professor Michael J Bommarito II. legalanalyticscourse.com
  2. < Binary Classification > access more at legalanalyticscourse.com
  3. http://scikit-learn.org/stable/tutorial/machine_learning_map/index.html
  4. To predict a quantity: regression methods. To predict a category: classification methods (trees, forests, kNN, etc.).
  5. Take a Look at These 12. Adapted from slides by Victor Lavrenko and Nigel Goddard @ University of Edinburgh
  6. The 12 agents (age, sex, species): 72, Female, Human; 3, Female, Horse; 36, Male, Human; 21, Male, Human; 67, Male, Human; 29, Female, Human; 54, Male, Human; 44, Male, Human; 50, Male, Human; 42, Female, Human; 6, Male, Dog; 7, Female, Human
  7. Binary Classification (Supervised Learning). Task = Determine Whether the Agents Will Obtain Employment. Job? f(agent) → Yes or No
  8. Classification (Supervised Learning). Job? f(agent) → Yes or No
  9. Classification (Supervised Learning). Job? f(agent) → Yes or No, with the two classes separated by a decision boundary
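
One way to picture slides 7 to 9 in code is a function f that maps an agent's features to Yes or No depending on which side of a decision boundary the point falls. The sketch below assumes a straight-line boundary; the weights, bias, and feature values are hypothetical and do not come from the slides.

```python
from typing import Sequence


def classify(features: Sequence[float], weights: Sequence[float], bias: float) -> str:
    """Return "Yes" or "No" depending on the side of the boundary.

    The boundary is the set of points where the weighted sum plus bias is zero.
    """
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return "Yes" if score > 0 else "No"


# Hypothetical boundary: the line x1 + x2 = 3 (an assumption for illustration).
WEIGHTS = (1.0, 1.0)
BIAS = -3.0

print(classify((2.5, 1.0), WEIGHTS, BIAS))  # Yes (2.5 + 1.0 - 3.0 > 0)
print(classify((0.5, 1.0), WEIGHTS, BIAS))  # No  (0.5 + 1.0 - 3.0 < 0)
```

In supervised learning the weights and bias would be learned from labeled examples rather than fixed by hand.
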
  10. Multi Class Classification
  11. https://www.youtube.com/watch?v=p5rTio1G4ys
  12. Multi Class Classification (Supervised Learning). Task = Determine Whether the Agents Will Obtain a Loan. Loan? f(agent) → Yes, Perhaps, or No
  13. Multi Class Classification (Supervised Learning). Loan? f(agent) → Yes, Perhaps, or No
  14. Multi Class Classification (Supervised Learning). Loan? f(agent) → Yes, Maybe (Perhaps), or No
  15. Multiclass = Hyperplane
  16. Regression (Supervised Learning). Task = Determine the Age of the Respective Agents. Age? f(agent) → # (a number)
  17. Generative vs. Discriminative Models
  18. [Image-only slide; no text recovered]
  19. Follow the video and take your own notes
  20. Intro to Decision Tree Learning: Classification And Regression Tree (CART)
  21. Decision Trees in Decision Theory ≠ Decision Trees in Machine Learning
  22. Introduction to Decision Trees: uses a set of binary rules applied to calculate a target value; used for classification (categorical variables) or regression (continuous variables); different algorithms are used to determine the “best” split at a node
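
As a rough illustration of slide 22, the same binary-rule structure can serve both tasks; what changes is how a leaf turns the training targets that reach it into a prediction. The sketch below assumes the common convention of the most frequent class for classification and the mean for regression. The function name `leaf_value` and the sample values are hypothetical.

```python
from collections import Counter
from statistics import mean


def leaf_value(targets, task):
    """Summarize the training targets in one leaf into a single prediction."""
    if task == "classification":
        # Most frequent category among the targets in this leaf.
        return Counter(targets).most_common(1)[0][0]
    if task == "regression":
        # Average of the continuous targets in this leaf.
        return mean(targets)
    raise ValueError(f"unknown task: {task!r}")


print(leaf_value(["Yes", "No", "Yes"], "classification"))  # Yes
print(leaf_value([20, 30, 40], "regression"))              # 30
```
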
  23. “CART Approach” to Decision Trees: Classification And Regression Tree (CART)
  24. https://www.youtube.com/watch?v=WOOTNBxbi8c
  25. http://www.r-bloggers.com/a-brief-tour-of-the-trees-and-forests/
  26. http://www.r-bloggers.com/classification-tree-models/
  27. https://www.youtube.com/watch?v=_RxqyvRK0Rw&list=PLD0F06AA0D2E8FFBA
  28. Given Some Data: (X1, Y1), ... , (Xn, Yn). Now We Have a New Set of X’s. We Want to Predict the Y
  29. CART (Classification & Regression Trees): Form a Binary Tree that Minimizes the Error in each leaf of the tree
  30. Observe the Correspondence Between the Data and Trees
  31. [Figure: scatter plot of training points labeled 0 or 1 in the (Xi1, Xi2) plane. Adapted from an example by Mathematical Monk]
  32. [Same scatter plot] We want to build an approach which can lead to the proper classification (labeling) of new data points that are dropped into this space
  33. [Same scatter plot]
  34. [Same scatter plot] Let's Begin to Partition the Space
  35. [Scatter plot with axis ticks at 1 and 2; split 1 is drawn, marking off zone (a)] Let's Begin to Partition the Space
  36. [Same plot with split 1 and zone (a)] This Split Will Be Memorialized in the Tree
  37. [Same plot; tree root: Xi1 > 1 ? with branches No and Yes] We Ask the Question: is Xi1 > 1 ? The response is binary (yes or no)
  38. [Same plot; the No branch ends in the leaf for zone (a), tally (0,5), Classify as 1] If No, then we are in zone (a): we tally the number of zeros and ones. Using Majority Rule, we assign a classification to this leaf
  39. [Same plot and tree] Here we Classify as a 1 because the tally is (0,5), which is 0 zeros and 5 ones
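
The majority rule on slides 38 and 39 can be sketched as a small function over a leaf's tally (zeros, ones). The tie-breaking choice (ties go to 0) is an assumption; the slides do not say what happens when the counts are equal.

```python
def majority_rule(zeros: int, ones: int) -> int:
    """Classify a leaf from its tally of zeros and ones.

    Ties are resolved in favor of class 0 (an assumption, not from the slides).
    """
    return 1 if ones > zeros else 0


print(majority_rule(0, 5))  # zone (a): 0 zeros and 5 ones -> 1
```
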
  40. [Same plot and tree] Using a Similar Approach, Let's Begin to Fill in the Rest of the Tree
  41. [Plot with splits 1 and 2. Tree: Xi1 > 1 ? No leads to zone (a), tally (0,5), Classify as 1; Yes leads to Xi2 > 1.45 ? with branches No and Yes. Adapted from an example by Mathematical Monk]
  42. [Plot with splits 1, 2 and 3; tick values 1, 2, 2.2 and 1.45. Tree questions: Xi1 > 1 ?, Xi2 > 1.45 ?, Xi1 > 2 ?. Leaves: zone (a), tally (0,5), Classify as 1; zones (b) and (c) with tallies (4,1), Classify as 0, and (2,3), Classify as 1]
  43. [Plot with splits 1 to 4; tick values 1, 2, 2.2 and 1.45. Tree questions: Xi1 > 1 ?, Xi2 > 1.45 ?, Xi1 > 2 ?, Xi1 > 2.2 ?. Leaves: zone (a), tally (0,5), Classify as 1; zones (b) to (e) with tallies (4,1), Classify as 0; (2,3), Classify as 1; (5,0), Classify as 0; (1,4), Classify as 1]
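
The leaf tallies on slide 43 are enough to work out how many training points the finished tree misclassifies: under majority rule, each leaf gets the minority count wrong. The sketch below uses only the five tallies from the slide; the variable and function names are hypothetical.

```python
# Leaf tallies (zeros, ones) as they appear on slide 43.
LEAF_TALLIES = [(0, 5), (1, 4), (5, 0), (4, 1), (2, 3)]


def leaf_errors(zeros: int, ones: int) -> int:
    """Training points misclassified in a leaf that predicts its majority class."""
    return min(zeros, ones)


total_points = sum(zeros + ones for zeros, ones in LEAF_TALLIES)
total_errors = sum(leaf_errors(zeros, ones) for zeros, ones in LEAF_TALLIES)

print(total_errors, total_points, total_errors / total_points)  # 4 25 0.16
```

So the four-split tree still misclassifies 4 of the 25 training points; further splits could reduce that number, at the cost of a larger tree.
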
  44. Okay, Let's Add Back the Markers, Which Are New Items to Be Classified
  45. For simplicity's sake, there is one in each zone
  46. We Will Use the Tree Because the Tree Is Our Prediction Machine
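
A minimal sketch of using a tree as a prediction machine: start at the root, answer each yes/no question, and stop at a leaf. The four questions and thresholds are taken from the slides, but the transcript does not preserve which branch leads where, so the layout below (which sub-split sits under Yes or No of Xi2 > 1.45, and which zone letter gets which label) is an assumption made for illustration. The class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    zone: str   # zone letter from the slides
    label: int  # class assigned by majority rule


@dataclass
class Split:
    feature: int      # 0 for Xi1, 1 for Xi2
    threshold: float  # question asked: point[feature] > threshold ?
    no: "Node"
    yes: "Node"


Node = Union[Leaf, Split]

# ASSUMED layout: only the questions and thresholds come from the slides.
TREE = Split(0, 1.0,
             no=Leaf("a", 1),
             yes=Split(1, 1.45,
                       no=Split(0, 2.0, no=Leaf("b", 0), yes=Leaf("c", 1)),
                       yes=Split(0, 2.2, no=Leaf("d", 0), yes=Leaf("e", 1))))


def predict(node: Node, point) -> Leaf:
    """Walk from the root to a leaf by answering each binary question."""
    while isinstance(node, Split):
        node = node.yes if point[node.feature] > node.threshold else node.no
    return node


# One hypothetical new item per zone.
for point in [(0.5, 2.0), (1.5, 1.0), (2.5, 1.0), (1.5, 2.0), (2.5, 2.0)]:
    leaf = predict(TREE, point)
    print(point, "-> zone", leaf.zone, "-> class", leaf.label)
```
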
  47. [Same plot and tree as slide 43]
  48. [Same plot and tree as slide 43, with the new items added and the labels 1 1 1 0 1 0 shown]
  49. [A second scatter plot of points labeled 0 or 1 in the (Xi1, Xi2) plane, with regions labeled A to G] How about this one?
  50. In this simple example, we eyeballed the 2D space, partitioned it, and stopped after 4 splits
  51. Most Real Problems are Not So Simple ...
  52. (1) Real problems are n-dimensional (not 2D)
  53. (2) For real problems, you need to select criteria (or a criterion) for deciding where to partition (split) the data
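
To make "a criterion for deciding where to split" concrete, here is a minimal sketch that scores every candidate threshold on a single feature and keeps the one with the lowest weighted impurity. It assumes Gini impurity for binary labels, which is one commonly used criterion; the slides do not name a specific one. The function names and sample data are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels: 2 * p * (1 - p)."""
    if not labels:
        return 0.0
    p1 = sum(labels) / len(labels)
    return 2.0 * p1 * (1.0 - p1)


def best_threshold(xs, ys):
    """Return (weighted_impurity, threshold) for the best split of one feature.

    Candidate thresholds are midpoints between consecutive distinct values.
    """
    pairs = sorted(zip(xs, ys))
    best = (float("inf"), None)
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between equal values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best[0]:
            best = (score, threshold)
    return best


# Hypothetical one-feature data: small values are 1s, large values are 0s.
print(best_threshold([0.5, 1.0, 2.0, 2.5, 3.0], [1, 1, 0, 0, 0]))  # (0.0, 1.5)
```

With n features, the same search would be repeated for each feature and the best (feature, threshold) pair kept.
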
  54. (3) For real problems, you must develop a stopping condition, or else keep recursively partitioning the space
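
A stopping condition can be as simple as a predicate checked before each further split. The sketch below combines three commonly used rules (pure node, depth limit, too few points); the specific rules and the default limits are assumptions for illustration, not taken from the slides.

```python
def should_stop(labels, depth, max_depth=3, min_samples=5):
    """Decide whether to turn the current region into a leaf.

    labels: training labels of the points that fall in this region
    depth:  number of splits already made on the path from the root
    """
    if len(set(labels)) <= 1:
        return True  # region is pure (or empty): nothing left to separate
    if depth >= max_depth:
        return True  # tree is already as deep as allowed
    if len(labels) < min_samples:
        return True  # too few points to justify another split
    return False


print(should_stop([1, 1, 1, 1, 1], depth=0))     # True: pure region
print(should_stop([0, 1, 0, 1, 1, 0], depth=1))  # False: keep splitting
print(should_stop([0, 1, 0, 1, 1, 0], depth=3))  # True: depth limit reached
```
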
  55. Solutions to these 3 Problems are among the core questions in algorithm selection / development
  56. From an Algorithmic Perspective, the Task is to Develop a Method to Partition the Trees
  57. Must Do So Without Knowing the Specific Contours of the Data / Problem in Question
  58. So How Do We Traverse Through The Data?
  59. Optimal Partitioning of Trees is NP-Complete
  60. “Although any given solution to an NP-complete problem can be verified quickly (in polynomial time), there is no known efficient way to locate a solution in the first place; indeed, the most notable characteristic of NP-complete problems is that no fast solution to them is known. That is, the time required to solve the problem using any currently known algorithm increases very quickly as the size of the problem grows”
  61. The key implication is that, with any currently known algorithm, one cannot efficiently determine the “optimal tree” for a problem of realistic size
  62. Breiman et al. (1984) use a Greedy Optimization Method
  63. The Greedy Optimization Method is used to compute a locally optimal MLE (maximum-likelihood estimate)
  64. Greedy is a Heuristic: “makes the locally optimal choice at each stage with the hope of finding a global optimum. In many problems, a greedy strategy does not in general produce an optimal solution, but nonetheless a greedy heuristic may yield locally optimal solutions that approximate a global optimal solution in a reasonable time.”
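
A small worked case shows why a greedy choice is only locally optimal. On the XOR-style data below (hypothetical, not from the slides), no single axis-aligned split reduces impurity at all, so a greedy learner sees no reason to prefer any first split, yet a two-level tree classifies every point correctly. The sketch assumes Gini impurity as the split criterion; the function names are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels: 2 * p * (1 - p)."""
    if not labels:
        return 0.0
    p1 = sum(labels) / len(labels)
    return 2.0 * p1 * (1.0 - p1)


def split_gain(points, labels, feature, threshold):
    """Impurity decrease from asking: point[feature] > threshold ?"""
    left = [y for p, y in zip(points, labels) if p[feature] <= threshold]
    right = [y for p, y in zip(points, labels) if p[feature] > threshold]
    weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
    return gini(labels) - weighted


# XOR pattern: the label is 1 exactly when the two coordinates differ.
POINTS = [(0, 0), (0, 1), (1, 0), (1, 1)]
LABELS = [0, 1, 1, 0]

# Greedy view: the gain of the only possible first split on each feature.
gains = {feature: split_gain(POINTS, LABELS, feature, 0.5) for feature in (0, 1)}
print(gains)  # {0: 0.0, 1: 0.0}


def two_level_tree(point):
    """Split on feature 0 first, then on feature 1 inside each half."""
    if point[0] > 0.5:
        return 0 if point[1] > 0.5 else 1
    return 1 if point[1] > 0.5 else 0


print([two_level_tree(p) for p in POINTS] == LABELS)  # True
```
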
  65. More on Trees (and Forests) Next Time ...
  66. Legal Analytics, Class 7: Binary Classification with Decision Tree Learning. daniel martin katz: blog | ComputationalLegalStudies, corp | LexPredict, twitter | @computational, site | danielmartinkatz.com. michael j bommarito: blog | ComputationalLegalStudies, corp | LexPredict, twitter | @mjbommar, site | bommaritollc.com. more content available at legalanalyticscourse.com
