
Legal Analytics Course - Class #2 - Introduction to Machine Learning for Lawyers - Professors Daniel Martin Katz + Michael J Bommarito


  1. Legal Analytics, Class 2: Machine Learning for Lawyers. Professor Daniel Martin Katz and Professor Michael J Bommarito II. legalanalyticscourse.com
  2. < What is Machine ‘Learning’? > access more at legalanalyticscourse.com
  3. rough and ready goal for ML: we are trying to learn from existing data to infer future data
  4. “A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E.” (Tom Mitchell, Carnegie Mellon University)
  5. inference is tricky for a variety of reasons
  6. (1) data might be noisy, and thus it is hard to infer the true signal
  7. (2) the dynamics change such that past data is a bad guide for future data
  8. (3) the researcher chooses the wrong method to do the forecasting
  9. remember to take stock of relevant baselines
  10. for example, how accurate is the existing method of forecasting?
  11. sometimes it is good, but far too often the existing method is not as good as many would think
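One way to take stock of a baseline, sketched here under assumptions: compare an existing forecaster's accuracy with the accuracy of always guessing the most common outcome. The outcome labels and the expert forecasts below are hypothetical illustration data, not figures from the course.

```python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most common label."""
    if not labels:
        raise ValueError("labels must be non-empty")
    most_common_count = Counter(labels).most_common(1)[0][1]
    return most_common_count / len(labels)


def accuracy(predictions, labels):
    """Fraction of predictions that match the observed labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


# Hypothetical case outcomes and hypothetical expert forecasts for the same cases.
outcomes = ["reverse", "affirm", "reverse", "reverse", "affirm",
            "reverse", "reverse", "affirm", "reverse", "reverse"]
expert_forecasts = ["reverse", "reverse", "affirm", "reverse", "affirm",
                    "affirm", "reverse", "affirm", "reverse", "affirm"]

baseline = majority_baseline_accuracy(outcomes)
expert = accuracy(expert_forecasts, outcomes)
print(f"always-guess-majority baseline: {baseline:.2f}")
print(f"existing expert method:         {expert:.2f}")
```

In this made-up data the existing method scores below the naive baseline, which is the kind of result that only shows up once the baseline is actually measured.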
  12. There are 3 Known Ways to Predict Something
  13. Experts, Crowds, Algorithms
  14. historically, decision making in law is heavily skewed toward the use of Human Experts
  15. rarely is the historic performance of Human Experts extensively benchmarked or validated using statistical methods
  16. in related fields this kind of benchmarking is becoming increasingly common
  17. it is even becoming common among those who use statistical and other methods to forecast
  18. Poll Aggregation is one form of ensemble where the learning question is to determine how much weight (if any) to assign to each individual poll
  19. selecting the ultimate ensemble of polls requires grading the historical performance of each pollster
  20. poll weighting
  21. [image-only slide]
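The poll-weighting idea on the preceding slides can be sketched as follows. This is a minimal illustration that assumes one simple weighting rule (weight proportional to the inverse of each pollster's historical mean absolute error). The slides do not specify a rule, and the pollster names and numbers are hypothetical.

```python
def inverse_error_weights(historical_errors):
    """Grade each pollster by mean absolute error and weight by its inverse.

    historical_errors maps a pollster name to a list of past errors
    (poll result minus actual result, in percentage points).
    The returned weights are normalized to sum to 1.
    """
    inverse = {}
    for name, errors in historical_errors.items():
        mean_abs_error = sum(abs(e) for e in errors) / len(errors)
        # Guard against division by zero for a pollster with a perfect record.
        inverse[name] = 1.0 / max(mean_abs_error, 1e-9)
    total = sum(inverse.values())
    return {name: value / total for name, value in inverse.items()}


def weighted_forecast(current_polls, weights):
    """Combine current poll readings into a single ensemble estimate."""
    return sum(weights[name] * value for name, value in current_polls.items())


# Hypothetical track records and current readings.
historical_errors = {
    "pollster_a": [1.0, -1.0, 1.0],
    "pollster_b": [2.0, -2.0, 2.0],
    "pollster_c": [4.0, 4.0, -4.0],
}
current_polls = {"pollster_a": 52.0, "pollster_b": 50.0, "pollster_c": 46.0}

weights = inverse_error_weights(historical_errors)
forecast = weighted_forecast(current_polls, weights)
print(weights)
print(round(forecast, 2))
```

A pollster with a poor enough record could also be dropped entirely, which corresponds to the "(if any)" in the slide text. That cutoff is left out of this sketch.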
  22. The goal of this course is to expose you to some of the methods used in predictive analytics
  23. < Supervised Learning versus Unsupervised Learning > *note: reinforcement learning will not be covered herein
  24. Supervised Learning
  25. typically supervision is undertaken through the provision/development of “gold standard” data
  26. Classic Example from Law is so-called ‘predictive coding’
  27. imagine your client is served with a request for production
  28. assume this is the size of the hypothetical document set (emails, memos, etc.), in random order
  29. we can sample a subset of the documents
  30. we can sample a subset of the documents
  31. classification, clustering, regression, dimension reduction
  32. classification
  33. [image-only slide]
  34. predictive coding ≈ binary classification
  35. Learning Task = Determine Whether a Given Document is Relevant. Binary Classification (Supervised Learning): f( ) → relevance? Relevant / Not Relevant [diagram inputs: and/or 010 101 001]
  36. take the sample set as a training set and use human experts
  37. the use of human experts to label the training set is what makes this “supervised learning”
  38. in the simple binary case, ask humans to assign objects to two piles
  39. Apply Human Coders
  40. and return this: yellow = relevant, white = non-relevant
  41. Non-Relevant | Relevant
  42. Non-Relevant | Relevant: gold standard data
  43. Key Insight ...
  44. What Allows A Human To Separate These Two Classes of Documents?
  45. that precise human process is what “predictive coding” is trying to mimic
  46. most vendors are selling a largely undifferentiated product
  47. Humans are selecting upon some “features” of the documents
  48. to place those documents in their respective bins (i.e., relevant, non-relevant)
  49. features = ? text, author, date, other metadata
  50. the machine learning task is trying to recover (learn) what separates the relevant from the non-relevant documents
  51. once we learn the rule / boundary we can apply it to separate the remaining documents into the two classes
  52. we want to take what we learn here
  53. we want to take what we learn here
  54. we want to take what we learn here and apply it here
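The workflow on these slides (human coders label a sample, a rule is learned from the labels, and the rule is applied to the remaining documents) might look roughly like the sketch below. The slides do not name a learning algorithm, so this assumes a small multinomial naive Bayes classifier over word counts, using text as the only feature. The documents and labels are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenization; real systems use richer features."""
    return text.lower().split()


def train_naive_bayes(labeled_docs):
    """Learn word statistics from (text, label) pairs coded by human reviewers."""
    doc_counts = Counter()   # number of training documents per label
    word_counts = {}         # label -> Counter of word frequencies
    vocabulary = set()
    for text, label in labeled_docs:
        doc_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for token in tokenize(text):
            counts[token] += 1
            vocabulary.add(token)
    return doc_counts, word_counts, vocabulary


def predict(model, text):
    """Return the label with the highest log-probability under the model."""
    doc_counts, word_counts, vocabulary = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in doc_counts:
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for token in tokenize(text):
            if token not in vocabulary:
                continue  # ignore words never seen during training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            score += math.log(
                (word_counts[label][token] + 1) / (total_words + len(vocabulary))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical "gold standard" sample produced by human coders.
gold_standard = [
    ("merger pricing agreement draft", "relevant"),
    ("pricing terms for the merger", "relevant"),
    ("agreement on merger timeline", "relevant"),
    ("lunch menu for friday", "not_relevant"),
    ("office party on friday", "not_relevant"),
    ("parking garage closed for cleaning", "not_relevant"),
]
model = train_naive_bayes(gold_standard)

# Apply what was learned to documents no human has reviewed.
unreviewed = ["revised merger pricing", "friday lunch party"]
for doc in unreviewed:
    print(doc, "->", predict(model, doc))
```

Author, date, and other metadata from the features slide could be added as further inputs. They are omitted here to keep the sketch short.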
  55. By Contrast: Unsupervised Learning
  56. Pre-Clustering Documents based on some sort of criterion
  57. Must determine the distance metric (similarity index)
  58. features = ? text, author, date, other metadata → distance metric / similarity index
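As one possible choice of similarity index for clustering documents, the sketch below computes cosine similarity between word-count vectors. The slides leave the metric open, so cosine similarity is an assumption here, and the example documents are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between two bag-of-words count vectors (0.0 to 1.0)."""
    counts_a = Counter(text_a.lower().split())
    counts_b = Counter(text_b.lower().split())
    # A Counter returns 0 for missing words, so only shared words contribute.
    dot = sum(counts_a[token] * counts_b[token] for token in counts_a)
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def cosine_distance(text_a, text_b):
    """Turn the similarity index into a distance for use by a clustering method."""
    return 1.0 - cosine_similarity(text_a, text_b)


doc_1 = "merger pricing agreement draft"
doc_2 = "draft merger agreement timeline"
doc_3 = "office party on friday"

print(cosine_similarity(doc_1, doc_2))  # shares three of four words
print(cosine_similarity(doc_1, doc_3))  # shares no words
```

No labels are involved at any point, which is the contrast with the supervised case: the grouping depends entirely on the chosen features and metric.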
  59. < Bias vs Variance Tradeoff >
  60. The Bias vs Variance Tradeoff: “The problem of simultaneously minimizing the bias (how accurate a model is across different training sets) and variance of the model error (how sensitive the model is to small changes in training set). Intuitively, it means that a model must be chosen that at the same time captures the regularities in its training data, but also generalizes well to unseen data.”
  61. The Bias vs Variance Tradeoff: Worst Case Scenario, Best Case Scenario (http://scott.fortmann-roe.com/docs/BiasVariance.html)
  62. Michael Clark citing Hastie, et al. (2009)
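For reference, the tradeoff is often written as the standard decomposition of expected squared error. The formula below is not taken from the slides. It assumes the data follow y = f(x) + ε with zero-mean noise of variance σ², and that the expectation is over training sets (and noise) for a fitted model f̂.

```latex
\mathbb{E}\!\left[\big(y - \hat{f}(x)\big)^2\right]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\!\left[\big(\hat{f}(x) - \mathbb{E}[\hat{f}(x)]\big)^2\right]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```

More flexible models tend to lower the bias term while raising the variance term, which is why the two cannot usually be minimized at the same time.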
  63. < Precision vs Recall >
  64. [image-only slide]
  65. http://en.wikipedia.org/wiki/Precision_and_recall
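Precision and recall for a relevance classifier can be computed as in the minimal sketch below. The label strings and the predicted and actual lists are hypothetical illustration data.

```python
def precision_recall(predicted, actual, positive="relevant"):
    """Return (precision, recall) for the positive class.

    precision = true positives / everything the model flagged as positive
    recall    = true positives / everything that really is positive
    """
    pairs = list(zip(predicted, actual))
    true_pos = sum(p == positive and a == positive for p, a in pairs)
    false_pos = sum(p == positive and a != positive for p, a in pairs)
    false_neg = sum(p != positive and a == positive for p, a in pairs)
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall


# Hypothetical review: 4 documents are truly relevant, the model flags 3.
actual = ["relevant"] * 4 + ["not_relevant"] * 6
predicted = ["relevant", "relevant", "not_relevant", "not_relevant",
             "relevant", "not_relevant", "not_relevant", "not_relevant",
             "not_relevant", "not_relevant"]

precision, recall = precision_recall(predicted, actual)
print(f"precision = {precision:.2f}, recall = {recall:.2f}")
```

In a document production, low recall means relevant documents were missed, while low precision means reviewers spend time on documents that turn out not to be relevant.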
  66. Legal Analytics, Class 2 - Machine Learning for Lawyers. daniel martin katz: blog | ComputationalLegalStudies, corp | LexPredict, twitter | @computational, site | danielmartinkatz.com. michael j bommarito: blog | ComputationalLegalStudies, corp | LexPredict, twitter | @mjbommar, site | bommaritollc.com. more content available at legalanalyticscourse.com
