daniel martin katz
michael j bommarito
adjunct professor @ university of michigan
associate professor of law @ illinois te...
Forum on Legal Evolution
NYC
in the legal
industry
there already
is better corn
"We always overestimate the change that will
occur in the next two years and underestimate
the change that will occur in t...
today’s focus is primarily
legal analytics + process engineering
© daniel martin katz michael j bommarito
before providing some
concrete examples -
some broad thoughts ...
three faces of innovation in legal
(1) lawyers for innovators / entrepreneurs
what most lawyers and law schools think of as “Law+Entrepreneurship”
(1) lawyers ...
(2) lawyers as innovators - substance
poison pill - “the most important innovation in corporate law
since Samuel Calvin Tate Dodd invented the trust
for John D....
emerging areas - 3D Printing, Driverless Cars, Augmented Reality,
Data Breach, Big Data+Privacy, etc.
Drones, Internet of ...
(3) lawyers as innovators - business/process
innovation directed toward transforming the practice of law
(3) lawyers as innova...
there are different ways that
organizations are innovating
on the third face
{Law
Substantive
Legal
Expertise
Analytics
Platform
AI
Computing
Process Mapping
User Experience
Design Thinking
Business ...
some traditional law firms
have been very aggressive
but most of the innovation
is Lex.Startup
Lex.Startup
is beginning to take hold
15
2009
Lex.Startup
15
2009
Lex.Startup
15 425+
2009 2014
Law or Legal Related Companies*
as highlighted by Josh Kubicki @ ReInventLaw London 2013
Lex.Startup
So what are these folks doing?
R + D Function in the
Legal Industry
We Could Imagine a World
Where Law Firms Did the
R+D for the Industry
But That Has
(Mostly) Proven Elusive
Lex.Startup
is undertaking that function
Here are the specific
approaches that are
being undertaken
Some organizations are
doing more than one
labor
arbitrage
labor
arbitrage
process/tech
arbitrage
labor
arbitrage
process/tech
arbitrage
regulatory
arbitrage
labor
arbitrage
process/tech
arbitrage
regulatory
arbitrage
design as the
ultimat...
labor
arbitrage
process/tech
arbitrage
regulatory
arbitrage
design as the
ultimat...
could do an
individual talk on
any of these topics...
labor
arbitrage
process/tech
arbitrage
regulatory
arbitrage
design as the
ultimat...
labor
arbitrage
process/tech
arbitrage
regulatory
arbitrage
design as the
ultimat...
Predictive Analytics in Law
The Data Driven
Future of the
Legal Industry
is already here...
2011
2011
2012
2013
2013
2013
2013
2013
2013
2013
Quantitative Legal Prediction
- or -
How I Learned to Stop Worrying and
Start Pre...
Cause
and
Effect
Quantitative
Legal
Prediction
vs.
Cause
and
Effect
Quantitative
Legal
Prediction
vs.
Machine Learning
is the heart of
predictive analytics
Legal Analytics
Professor Daniel Martin Katz
Professor Michael J Bommarito II
@MSU Law - Winter 2014
Supervised
Statistical models
Bayesian, e.g., Naïve Bayes Classification
Frequentist, e.g., Ordinary Least Squares
Neural ...
http://scikit-learn.org/stable/tutorial/machine_learning_map/index.html
classification
clustering
regression
dimension reduction
the family of machine learning methods
Quick Example of
the Methods
Adapted from Slides By
Victor Lavrenko and Nigel Goddard
@ University of Edinburg...
72
Female
Human
3
Female
Horse
36
Male
Human
21
Male
Human
67
Male
Human
29
Femal...
Classification
(Supervised Learning)
decision
boundary
female
male
f( )
Gender?
Classification
(Supervised Learning)
decision
boundary
female
male
f( )
Gender?
Re...
Classification
(Supervised Learning)
decision
boundary
female
male
f( )
Gender?
f(...
Classification
(Supervised Learning)
decision
boundary
female
male
f( )
Gender?
f(...
Regression as a Prediction Tool
Regression as a Prediction Tool
Standard Linear Regression
Can Be Used to
Predict a Probability
(using LPM, Logit...
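As a rough illustration of the logit case, a linear predictor can be passed through the logistic function to yield a probability between 0 and 1. The function name, coefficients, and feature values below are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math


def logit_probability(intercept, coefficients, features):
    """Map a linear predictor to a probability with the logistic function."""
    linear = intercept + sum(b * x for b, x in zip(coefficients, features))
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical coefficients and feature values (not taken from the slides).
# The linear predictor is -1.0 + 0.5 * 2.0 + 2.0 * 0.0 = 0.0.
p = logit_probability(-1.0, [0.5, 2.0], [2.0, 0.0])
print(p)  # a linear predictor of 0.0 maps to a probability of 0.5
```

A linear probability model (LPM) would instead report the linear predictor itself, which is simpler but can fall outside the 0 to 1 range.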
Standard Linear Regression
Can Be Used to
Predict a Quantity
Task = Predict the Expected Cost of
a Given Legal Service
f( )
Cost?
#
and/or
010...
http://reinventlawchannel.com/ron-gruner-were-on-a-mission/
Y = β0 +/- β1 ( X1 ) +/- β2 ( X2 ) +/- β3 ( X3 ) +/- β4 ( X4 ) +/- β5 ( X5 ) + ε
...
Turn Around and
Use This Model
To Predict Other Lawyers
(also Matters, etc.)
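A minimal sketch of what "using this model to predict other lawyers" could look like, built on the fitted values shown on the regression slide ($151, $15, 161, 95, 34). Pairing each value with a label (per 100 lawyers, Tier 1 market, partner status, per 10 years) follows the order in which they appear on the slide and is an assumption. The slide gives no value for the practice-area term, so it is left as a parameter. The function and argument names are hypothetical.

```python
def predict_cost(firm_lawyers, tier1_market, partner, years_experience,
                 practice_area_effect=0.0):
    """Evaluate the slide's fitted linear model for one lawyer.

    The coefficient-to-label mapping is assumed from slide order. The
    practice-area coefficient (beta_5) is not given on the slide, so the
    caller supplies it; it defaults to 0.0 purely as a placeholder.
    """
    return (151.0
            + 15.0 * (firm_lawyers / 100.0)        # per 100 lawyers
            + 161.0 * (1 if tier1_market else 0)   # if Tier 1 market is true
            + 95.0 * (1 if partner else 0)         # if partner status is true
            + 34.0 * (years_experience / 10.0)     # per 10 years
            + practice_area_effect)                # unknown beta_5 term


# Hypothetical lawyer: partner at a 200-lawyer firm in a Tier 1 market
# with 20 years of experience.
# 151 + 15*2 + 161 + 95 + 34*2 = 505
print(predict_cost(200, True, True, 20))  # 505.0
```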
This Requires a Method to Deal
With Changes in Dynamics, etc.
This Requires a Method to Update
the Model as Time Moves Forward
Must Deal With Overfitting
to the Existing Data
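One common way to check for overfitting, sketched here under simple assumptions, is to fit on part of the data and measure error on a held-out part. The data, variable names, and the one-variable least squares fit are hypothetical stand-ins for a real model.

```python
def fit_line(xs, ys):
    """Ordinary least squares for a single predictor: y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def mean_abs_error(model, xs, ys):
    """Average absolute gap between predictions and observed values."""
    intercept, slope = model
    return sum(abs(intercept + slope * x - y) for x, y in zip(xs, ys)) / len(xs)


# Hypothetical data: a predictor (e.g. years of experience) and a cost.
train_x, train_y = [1, 3, 5, 7], [110, 130, 150, 170]
holdout_x, holdout_y = [2, 6], [125, 155]

model = fit_line(train_x, train_y)
train_error = mean_abs_error(model, train_x, train_y)
holdout_error = mean_abs_error(model, holdout_x, holdout_y)
print(model, train_error, holdout_error)
```

A held-out error that is much larger than the training error suggests the model has fit quirks of the existing data rather than a pattern that generalizes.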
Machine Learning and
the Future of E-Discovery
imagine your client is served
with a request for production
in random
order
assume
this is the
size
of the
hypothetical
document
set
(emails,
memos,
etc.)
we can
sample
a subset
of the
documents
we can
sample
a subset
of the
documents
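The sampling step can be sketched as drawing a random subset of document identifiers for human review and setting the rest aside for later classification. The collection size, sample size, and names are hypothetical.

```python
import random


def sample_documents(doc_ids, sample_size, seed=0):
    """Draw a random sample of document IDs without replacement."""
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    return rng.sample(doc_ids, sample_size)


# Hypothetical collection of 10,000 documents (emails, memos, etc.).
doc_ids = list(range(10000))
training_ids = sample_documents(doc_ids, 500)
remaining_ids = sorted(set(doc_ids) - set(training_ids))
print(len(training_ids), len(remaining_ids))
```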
classification
clustering
regression
dimension reduction
classification
predictive coding =
~ binary classification
Learning Task = Determine Whether a Given
Document is Relevant?
Relevant
Not Relev...
take the sample set as
a training set and
use human experts
the use of the human
experts is called
“supervised learning”
in the simple binary case,
ask humans to assign
objects to two piles
Apply Human Coders
yellow = relevant
white = non-relevant
and return this
Non-Relevant / Relevant
Key Insight ...
What Allows A
Human To Separate
These Two Classes of
Documents?
that precise human
process is what
“predictive coding”
is trying to mimic
most vendors are selling a
largely undifferentiated product
Humans are selecting
upon some “features”
of the documents
to place those
documents in their
respective bins

(i.e. relevant, non-relevant)
features =?
text,
author,
date,
other metadata
machine learning task is
trying to recover (learn)
what separates the
relevant from the
non-relevant documents
once we learn the
rule / boundary
we can apply it to separate
the remaining documents into
the two classes
we want to take what we learn here
we want to take what we learn here
we want to take what we learn here
and apply it here
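The learn-here, apply-there loop can be sketched with a small Naive Bayes text classifier, one of the supervised methods listed earlier in the deck. This is an illustrative toy, not any vendor's predictive coding product: the documents, labels, and function names are hypothetical, and text is the only feature used.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenization; real systems use richer features."""
    return text.lower().split()


def train_naive_bayes(labeled_docs):
    """Count documents per class and words per class from human-coded documents."""
    class_counts = Counter()
    word_counts = {}
    vocab = set()
    for text, label in labeled_docs:
        class_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for tok in tokenize(text):
            counts[tok] += 1
            vocab.add(tok)
    return class_counts, word_counts, vocab


def classify(model, text):
    """Return the class with the highest log-probability (Laplace smoothing)."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for tok in tokenize(text):
            if tok not in vocab:
                continue  # ignore words never seen in the training sample
            score += math.log((word_counts[label][tok] + 1)
                              / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical sample coded by human reviewers (the training set).
coded_sample = [
    ("merger pricing agreement draft", "relevant"),
    ("pricing terms for the merger", "relevant"),
    ("lunch menu for friday", "non-relevant"),
    ("friday parking garage closed", "non-relevant"),
]
model = train_naive_bayes(coded_sample)

# Apply the learned rule to documents no human has reviewed.
for doc in ["revised merger pricing", "friday lunch plans"]:
    print(doc, "->", classify(model, doc))
```

In practice the training sample would be far larger, and features such as author, date, and other metadata could be added alongside the text.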
the future of e-discovery
will follow the arc of
machine learning
Supervised Unsupervised
Predictive
Coding
(Classification)
The Long
Term Future
Machine
Learning
Methods
2 x 2
Informed
Na...
there are different
forms of learning
by machines ...
There Is Learning
Within a Matter
(i.e. learning from a
specific training set)
In other words, it is
possible for the
machine to learn from
the experience of
having processed
documents in the past
both inside a given
company but also
across companies ...
this is how
data aggregation / reusing data
becomes very powerful
data aggregation / reusing data
make the naive into the informed
data aggregation / reusing data
help move from the supervised
to the semi/unsupervised
Supervised Unsupervised
Predictive
Coding
(Classification)
The Future
Machine
Learning
Methods
2 x 2
Informed
Naive
Basic
...
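For the unsupervised side of the 2 x 2, a basic clustering algorithm groups items without any human-assigned labels. The sketch below is a one-dimensional k-means with explicitly supplied starting centroids. The feature values and names are hypothetical, and a real document-clustering system would work on many features at once.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Basic k-means on one numeric feature, starting from given centroids."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters


# Hypothetical one-dimensional feature for six documents.
values = [1, 2, 3, 10, 11, 12]
centroids, clusters = kmeans_1d(values, centroids=[1.0, 12.0])
print(centroids, clusters)
```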
Machine Learning
Natural Language Processing
and Due Diligence
The system comes pre-trained 

for provisions including:
Title, Parties, Date, Te...
Based on testing, we know our system finds
90% or more of the instances of nearly
every substantive provision it covers.
T...
We are able to build custom provisions on
request. Thanks to our highly customized
training algorithms, this process is ea...
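The quoted claim of finding 90% or more of the instances of a provision is a statement about recall. As a sketch of how recall and precision are computed, assuming the true instances are known from human review, the example below uses hypothetical clause identifiers.

```python
def precision_recall(predicted, actual):
    """Precision: share of flagged items that are correct.
    Recall: share of true items that were flagged."""
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical review: 10 true instances of a provision; the system flags
# 12 passages, 9 of which are true instances.
actual = set(range(10))
predicted = set(range(9)) | {100, 101, 102}
precision, recall = precision_recall(predicted, actual)
print(precision, recall)  # 9/12 = 0.75 precision, 9/10 = 0.9 recall
```

High recall with lower precision means few provisions are missed, but a reviewer still has to discard some false hits.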
Machine Learning
and
Judicial Behavior
2002 Prediction Tourney
and its limits
Model Leverages
Classification Tree
(Tool from Machine Learning)
Standard Decision Tree
Often Not Generalizable,
Often Overfits the Data
Need a more complex approach
Predicting the Behavior of the
United States Supreme Court:
A General Approach
Bl...
feature engineering
The real world gives us raw material, at best.
 Typically, yo...
similar approach can be
applied to other problems
Case Prediction
and
Litigation Data
“John Dragseth, a principal at
Fish & Richardson (the most active
IP litigation f...
Notice there is an
offloading of data but
it is up to the end user
to derive mean...
In general, the relevant
consumer market is not
yet mature when it
comes to data ...
Difficult to sell machine
learning technology in
instances where the
end user doe...
Many other examples ...
just starting to come online
Attorney Quality
and Performance
Leveraging Public Data
for Legal Insight
Change
Management
is the Hardest
Innovation of All
Bulls (~1984 - 2009) and Bears (~2009 - 2014)
If you were
28 in 1984
then you were
53 in 2009
58 in 2014
before 2009 most of
the individuals in the
profession have only
known the bull ma...
it is a bear market now ...
and in a bear market you
need a serious strategy
analytics/data should be
part of that strategy
“data is the oil of the 21st Century”
So let's be wildcatters
law < > finance
many elements in law look
like finance did 25 years ago
When it comes to
innovation at the
level that is going
to be needed ...
Assigning an innovation partner
or an innovation committee is
probably not enough
Skunk Works
how many
organizations have a
full time data scientist
(data science team)?
need a full scale
and empowered
R+D team
(data science, tech, etc.)
Final Thought
Exit,
Voice
&
Loyalty
daniel martin katz
michael j bommarito ii
adjunct professor of Law @ michigan state university
associate professor of law ...

Legal Analytics, Machine Learning and Some Comments on the Status of Innovation in the Legal Industry - Professors Daniel Martin Katz & Michael J Bommarito II - Presentation @ The Forum on Legal Evolution- NYC

  1. daniel martin katz michael j bommarito adjunct professor @ university of michigan associate professor of law @ illinois tech - chicago kent co-founder @ LexPredict co-founder @ LexPredict Legal Analytics, Machine Learning and Some Comments on the Status of Innovation in the Legal Industry
  2. Forum on Legal Evolution NYC
  3. in the legal industry there already is better corn
  4. “We always overestimate the change that will occur in the next two years and underestimate the change that will occur in the next ten” - bill gates
  5. today’s focus is primarily legal analytics + process engineering
  6. before providing some concrete examples - some broad thoughts ...
  7. three faces of innovation in legal
  8. (1) lawyers for innovators / entrepreneurs
  9. what most lawyers and law schools think of as “Law+Entrepreneurship” (1) lawyers for innovators / entrepreneurs
  10. (2) lawyers as innovators - substance
  11. poison pill - “the most important innovation in corporate law since Samuel Calvin Tate Dodd invented the trust for John D. Rockefeller and Standard Oil in 1879” (2) lawyers as innovators - substance
  12. emerging areas - 3D Printing, Driverless Cars, Augmented Reality, Drones, Internet of Things, CyberSecurity, Data Breach, Big Data+Privacy, etc. (2) lawyers as innovators - substance
  13. (3) lawyers as innovators - business/process
  14. innovation directed toward transforming the practice of law (3) lawyers as innovators - business/process
  16. there are different ways that organizations are innovating on the third face
  17. {Law Substantive Legal Expertise Analytics Platform AI Computing Process Mapping User Experience Design Thinking Business Models Regulation Marketing + Tech + Design TM + Delivery}
  19. some traditional law firms have been very aggressive
  21. but most of the innovation is Lex.Startup
  22. Lex.Startup is beginning to take hold
  23. 15 2009 Lex.Startup
  24. 15 2009 Lex.Startup
  25. 15 425+ 2009 2014 Law or Legal Related Companies* as highlighted by Josh Kubicki @ ReInventLaw London 2013 Lex.Startup
  26. So what are these folks doing?
  27. R + D Function in the Legal Industry
  28. We Could Imagine a World Where Law Firms Did the R+D for the Industry
  29. But That Has (Mostly) Proven Elusive
  30. Lex.Startup is undertaking that function
  31. Here are the specific approaches that are being undertaken
  32. Some organizations are doing more than one
  33. labor arbitrage
  34. labor arbitrage, process/tech arbitrage
  35. labor arbitrage, process/tech arbitrage, regulatory arbitrage
  36. labor arbitrage, process/tech arbitrage, regulatory arbitrage, design as the ultimate bespoke
  37. labor arbitrage, process/tech arbitrage, regulatory arbitrage, design as the ultimate bespoke, predictive analytics
  38. could do an individual talk on any of these topics...
  39. labor arbitrage, process/tech arbitrage, regulatory arbitrage, design as the ultimate bespoke, predictive analytics
  40. labor arbitrage, process/tech arbitrage, regulatory arbitrage, design as the ultimate bespoke, predictive analytics
  42. Predictive Analytics in Law
  43. The Data Driven Future of the Legal Industry
  44. is already here...
  45. 2011
  46. 2011
  47. 2012
  48. 2013
  49. 2013
  50. 2013
  51. 2013
  52. 2013
  53. 2013
  54. 2013
  55. Quantitative Legal Prediction - or - How I Learned to Stop Worrying and Start Preparing for the Data Driven Future of the Legal Services Industry, Daniel Martin Katz, Associate Professor of Law, Michigan State University, 62 Emory L. J. 909 (2013)
  56. Cause and Effect vs. Quantitative Legal Prediction
  57. Cause and Effect vs. Quantitative Legal Prediction
  58. Machine Learning is the heart of predictive analytics
  59. Legal Analytics, Professor Daniel Martin Katz, Professor Michael J Bommarito II @MSU Law - Winter 2014
  60. Some Machine Learning Methods. Supervised: Statistical models (Bayesian, e.g., Naïve Bayes Classification; Frequentist, e.g., Ordinary Least Squares); Neural Networks (NN); Support Vector Machines (SVM); Random Forests (RF); Genetic Algorithms (GA). Semi/Unsupervised: Neural Networks (NN); Clustering; K-means; Hierarchical; Radial Basis (RBF); Graph
  61. http://scikit-learn.org/stable/tutorial/machine_learning_map/index.html
  62. the family of machine learning methods: classification, clustering, regression, dimension reduction
  63. Quick Example of the Methods
  64. Adapted from Slides By Victor Lavrenko and Nigel Goddard @ University of Edinburgh. Take A Look: These 12
  65. 72 Female Human; 3 Female Horse; 36 Male Human; 21 Male Human; 67 Male Human; 29 Female Human; 54 Male Human; 44 Male Human; 50 Male Human; 42 Female Human; 6 Male Dog; 7 Female Human
  66. Classification (Supervised Learning): f( ) Gender? (female / male, decision boundary)
  67. Classification (Supervised Learning): f( ) Gender? (female / male, decision boundary); Regression (Supervised Learning): f( ) Age? (#)
  68. Classification (Supervised Learning): f( ) Gender? (female / male, decision boundary); Multi Class Classification (Supervised Learning): f( ) Loan Application? (Yes / No / Maybe; Yes / Perhaps / No); Multiclass = Boundary Hyperplane; Regression (Supervised Learning): f( ) Age? (#)
  69. Classification (Supervised Learning): f( ) Gender? (female / male, decision boundary); Multi Class Classification (Supervised Learning): f( ) Loan Application? (Yes / No / Maybe; Yes / Perhaps / No); Multiclass = Boundary Hyperplane; Regression (Supervised Learning): f( ) Age? (#); Clustering (Unsupervised Learning): f( ) Group? (Cluster)
  70. Regression as a Prediction Tool
  71. Regression as a Prediction Tool
  72. Standard Linear Regression Can Be Used to Predict a Probability (using LPM, Logit, etc.)
  73. Standard Linear Regression Can Be Used to Predict a Quantity
  74. Task = Predict the Expected Cost of a Given Legal Service; f( ) Cost? (# and/or 010 101 001); Regression (Supervised Learning)
  75. http://reinventlawchannel.com/ron-gruner-were-on-a-mission/
  76. Y = β0 +/- β1 ( X1 ) +/- β2 ( X2 ) +/- β3 ( X3 ) +/- β4 ( X4 ) +/- β5 ( X5 ) + ε; Y = $151 + $15 (Per 100 Lawyers) + 161 (If Tier 1 Market is True) + 95 (Partner Status is True) + 34 (Per 10 Years) +/- β5 (Practice Area) + ε
  77. Turn Around and Use This Model To Predict Other Lawyers (also Matters, etc.)
  78. This Requires a Method to Deal With Changes in Dynamics, etc.
  79. This Requires a Method to Update the Model as Time Moves Forward
  80. Must Deal With Overfitting to the Existing Data
  82. Machine Learning and the Future of E-Discovery
  83. imagine your client is served with a request for production
  84. assume this is the size of the hypothetical document set (emails, memos, etc.) in random order
  85. we can sample a subset of the documents
  86. we can sample a subset of the documents
  87. classification, clustering, regression, dimension reduction
  88. classification
  90. predictive coding = ~ binary classification
  91. Learning Task = Determine Whether a Given Document is Relevant; f( ) relevance? (Relevant / Not Relevant; and/or 010 101 001); Binary Classification (Supervised Learning)
  92. take the sample set as a training set and use human experts
  93. the use of the human experts is called “supervised learning”
  94. in the simple binary case, ask humans to assign objects to two piles
  95. Apply Human Coders
  96. yellow = relevant, white = non-relevant and return this
  97. Non-Relevant / Relevant
  98. Key Insight ...
  99. What Allows A Human To Separate These Two Classes of Documents?
  100. that precise human process is what “predictive coding” is trying to mimic
  101. most vendors are selling a largely undifferentiated product
  102. 102. Humans are selecting upon some “features” of the documents © daniel martin katz michael j bommarito
  103. 103. to place those documents in their respective bins
 (i.e. relevant, non-relevant) © daniel martin katz michael j bommarito
  104. 104. features =? text, author, date, other metadata © daniel martin katz michael j bommarito
  105. 105. machine learning task is trying to recover (learn) what separates the relevant from the non-relevant documents © daniel martin katz michael j bommarito
  106. 106. once we learn the rule / boundary we can apply it to separate the remain documents into the two classes © daniel martin katz michael j bommarito
  107. 107. © daniel martin katz michael j bommarito we want to take what we learn here
  108. 108. © daniel martin katz michael j bommarito we want to take what we learn here
  109. 109. © daniel martin katz michael j bommarito we want to take what we learn here and apply it here
  110. 110. © daniel martin katz michael j bommarito
  111. 111. the future of e-discovery will follow the arc of machine learning © daniel martin katz michael j bommarito
  112. 112. Supervised Unsupervised Predictive Coding (Classification) The Long Term Future Machine Learning Methods 2 x 2 Informed Naive Basic Clustering Algorithm © daniel martin katz michael j bommarito
  113. 113. there are different forms of learning by machines ...
  114. 114. There Is Learning Within a Matter (i.e. learning from a specific training set)
  115. 115. In other words, it is possible for the machine to learn from the experience of having processed documents in the past
  116. 116. both inside a given company but also across companies ...
  117. 117. this is how data aggregation / reusing data becomes very powerful
  118. 118. data aggregation / reusing data make the naive into the informed
  119. 119. data aggregation / reusing data help move from the supervised to the semi/unsupervised
  120. 120. Machine Learning Methods 2 x 2 (Supervised vs. Unsupervised, Informed vs. Naive): Predictive Coding (Classification); Basic Clustering Algorithm; The Future
  122. 122. Machine Learning Natural Language Processing and Due Diligence
  124. 124. The system comes pre-trained for provisions including: Title, Parties, Date, Term, Change of Control, Assignment, Indemnity, Confidentiality, Governing Law, License Grant, Bankruptcy, Notice, Amendment, Non-Solicit, and more.
  125. 125. Based on testing, we know our system finds 90% or more of the instances of nearly every substantive provision it covers. This 90% number is our system’s recall; its precision differs provision by provision but is consistently very manageable.
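For readers less familiar with the two terms on this slide, a short sketch of how recall and precision are computed. The counts below are hypothetical numbers chosen only to reproduce a 90% recall figure; they are not the vendor's test results.

```python
def recall(true_positives, false_negatives):
    """Share of the actual provision instances that the system found."""
    return true_positives / (true_positives + false_negatives)

def precision(true_positives, false_positives):
    """Share of the passages the system flagged that really are the provision."""
    return true_positives / (true_positives + false_positives)

# Hypothetical counts for one provision type (illustrative only).
found_and_correct = 90  # true positives: real instances the system flagged
missed = 10             # false negatives: real instances the system did not flag
flagged_in_error = 30   # false positives: flagged passages that are not the provision

r = recall(found_and_correct, missed)
p = precision(found_and_correct, flagged_in_error)
print(f"recall={r:.2f} precision={p:.2f}")
```

High recall with lower precision means a reviewer sees nearly every real instance but also has to discard some false hits, which is why the slide treats precision as something to be "manageable" rather than perfect.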
  126. 126. We are able to build custom provisions on request. Thanks to our highly customized training algorithms, this process is easy and relatively automated. We are also engaged in adding more provisions.
  128. 128. Machine Learning and Judicial Behavior
  133. 133. 2002 Prediction Tourney and its limits
  134. 134. Model Leverages Classification Tree (Tool from Machine Learning)
  135. 135. Standard Decision Tree Often Not Generalizable, Often Overfits the Data
  136. 136. Need a more complex approach
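The slides do not say at this point which "more complex approach" is meant. One common remedy for a single overfit tree is an ensemble: many small trees, each trained on a random resample of the data, that vote on the outcome. The sketch below assumes that remedy and uses bagged one-split trees ("stumps") in plain Python; the three binary features and the toy cases are hypothetical.

```python
import random
from collections import Counter

def fit_stump(rows):
    """Pick the single binary feature whose value best predicts the label in rows."""
    overall = Counter(label for _, label in rows).most_common(1)[0][0]
    best = None
    for f in range(len(rows[0][0])):
        # Tally labels on each side of the split (feature value 0 or 1).
        sides = {0: Counter(), 1: Counter()}
        for features, label in rows:
            sides[features[f]][label] += 1
        correct = sum(max(c.values()) for c in sides.values() if c)
        # Each side predicts its majority label; an empty side falls back to the overall majority.
        rule = {v: (c.most_common(1)[0][0] if c else overall) for v, c in sides.items()}
        if best is None or correct > best[0]:
            best = (correct, f, rule)
    _, feature, rule = best
    return feature, rule

def fit_ensemble(rows, n_trees=25, seed=0):
    """Train each stump on a bootstrap resample of the training rows (bagging)."""
    rng = random.Random(seed)
    return [fit_stump([rng.choice(rows) for _ in rows]) for _ in range(n_trees)]

def predict(ensemble, features):
    """Majority vote across all stumps in the ensemble."""
    votes = Counter(rule[features[f]] for f, rule in ensemble)
    return votes.most_common(1)[0][0]

# Hypothetical binary features per case (illustrative only):
# (lower_court_ruled_liberal, government_is_petitioner, circuit_split)
training = [
    ((1, 0, 1), "reverse"), ((1, 1, 0), "reverse"),
    ((1, 0, 0), "reverse"), ((1, 1, 1), "reverse"),
    ((0, 0, 1), "affirm"), ((0, 1, 0), "affirm"),
    ((0, 1, 1), "affirm"), ((0, 0, 0), "affirm"),
]
ensemble = fit_ensemble(training)
```

Because every stump sees a slightly different sample, quirks of any one sample tend to be outvoted, which is the intuition behind preferring an ensemble to a single tree.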
  137. 137. Predicting the Behavior of the United States Supreme Court: A General Approach [figure: one panel per justice, Black through Kagan, 1953 to 2013; legend entries: 9-0, Reverse, and 8-1, 7-2, 6-3; vertical axis 0.00 to 1.00]
  138. 138. feature engineering: The real world gives us raw material, at best. Typically, you even have to dig the raw material out of your own unstructured data
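As a small illustration of what digging features out of unstructured data can look like, the sketch below turns one free-text case summary into a flat record that a model could consume. The feature names and the sample text are hypothetical and are not the features used in the authors' model.

```python
import re

def extract_features(raw_record):
    """Turn one free-text case summary into a flat dict of model-ready features."""
    text = raw_record.lower()
    # First four-digit number that looks like a year in the 1900s or 2000s.
    year_match = re.search(r"\b(19|20)\d{2}\b", text)
    return {
        "year": int(year_match.group(0)) if year_match else None,
        "mentions_patent": "patent" in text,
        "lower_court_reversed": "reversed" in text,
        "word_count": len(text.split()),
    }

# Hypothetical raw input (illustrative only).
raw = "Argued in 2012. Patent dispute; the circuit court reversed the trial judge."
features = extract_features(raw)
```

Simple keyword and pattern rules like these are crude, but they show the basic move: unstructured text goes in, structured columns come out.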
  139. 139. similar approach can be applied to other problems
  141. 141. Case Prediction and Litigation Data
  145. 145. “John Dragseth, a principal at Fish & Richardson (the most active IP litigation firm in the United States, according to Corporate Counsel magazine), credits Lex Machina’s database with helping him spot meaningful but otherwise hidden trends in IP litigation, and he won’t give details. ‘If you published it, then people on the other side would know,’ he says.”
  146. 146. Notice there is an offloading of data but it is up to the end user to derive meaning
  147. 147. In general, the relevant consumer market is not yet mature when it comes to data science
  148. 148. Difficult to sell machine learning technology in instances where the end user does not have the right assets in place
  150. 150. Many other examples ... just starting to come online
  151. 151. Attorney Quality and Performance
  152. 152. Leveraging Public Data for Legal Insight
  155. 155. Change Management is the Hardest Innovation of All
  156. 156. Bulls and Bears: bull market ~1984 - 2009, bear market ~2009 - 2014
  157. 157. If you were 28 in 1984 then you were 53 in 2009 and 58 in 2014
  158. 158. before 2009 most of the individuals in the profession had only known the bull market
  159. 159. it is a bear market now ... and in a bear market you need a serious strategy
  160. 160. analytics/data should be part of that strategy
  161. 161. “data is the oil of the 21st Century”
  162. 162. So let’s be wildcatters
  163. 163. law < > finance many elements in law look like finance did 25 years ago
  165. 165. When it comes to innovation at the level that is going to be needed ...
  166. 166. Assigning an innovation partner or an innovation committee is probably not enough
  167. 167. Skunk Works
  168. 168. how many organizations have a full time data scientist (data science team)?
  169. 169. need a full scale and empowered R+D team (data science, tech, etc.)
  170. 170. Final Thought
  171. 171. Exit, Voice & Loyalty
  172. 172. daniel martin katz michael j bommarito ii adjunct professor of Law @ michigan state university associate professor of law @ illinois tech - chicago kent co-founder @ LexPredict director of research @ reInventLaw laboratory co-founder @ LexPredict Forum on Legal Evolution NYC
