We Will Use the Tree Because
the Tree Is Our Prediction Machine
[Figure, repeated across three slides: points labeled 0 and 1 plotted on axes Xi1 and Xi2, partitioned by split 1 through split 4 (axis marks at 1, 2, and 2.2), alongside the tree that begins Xi1 > 1 ? with tally (0,5), followed by Xi2 > ...]
Experts, Crowds, Algorithms
For most problems ...
ensembles of these streams
outperform any single stream
Humans + Machines > Humans or Machines
Ensembles come in
various forms
Here is a well known example
Poll Aggregation is one form of
ensemble where the learning question is
to determine how much weight (if any)
to assign to...
poll weighting
A Visual Depiction of
How to build an
ensemble method in our
judicial prediction example
expert crowd algorithm
ensemble method
learning problem is to discover when to use a given stream of intelligence
expert crowd algorithm
via back testing we can learn the
weights to apply for particular problems
ensemble method
learning...
{Marshall}+
algorithm
expert
crowd
algorithm
{Marshall}+ improvement
will likely come from
determining the optimal
weighting of experts,
crowds and algorithms
for vari...
thus ERISA cases might look like this
perhaps Patent cases might look like this
while Search/Seizure cases could look like this
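The idea of learning, via back testing, a different weighting of experts, crowds and algorithms for each kind of case can be sketched in a few lines. This is only an illustration under stated assumptions, not the {Marshall}+ method: the back-test records are made up, the stream names are placeholders, and weighting each stream in proportion to its back-tested accuracy is one simple rule chosen for the example.

```python
from collections import defaultdict

# Hypothetical back-test records: (case_type, stream, predicted, actual).
# Outcomes are coded 1 = reverse, 0 = affirm. All values are made up.
BACKTEST = [
    ("patent", "expert", 1, 1), ("patent", "crowd", 0, 1), ("patent", "algorithm", 1, 1),
    ("patent", "expert", 0, 0), ("patent", "crowd", 1, 0), ("patent", "algorithm", 0, 0),
    ("erisa", "expert", 1, 0), ("erisa", "crowd", 0, 0), ("erisa", "algorithm", 1, 0),
    ("erisa", "expert", 0, 1), ("erisa", "crowd", 1, 1), ("erisa", "algorithm", 1, 1),
]

def learn_weights(records):
    """Weight each stream by its back-tested accuracy within each case type."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for case_type, stream, predicted, actual in records:
        totals[(case_type, stream)] += 1
        hits[(case_type, stream)] += int(predicted == actual)
    weights = defaultdict(dict)
    for (case_type, stream), n in totals.items():
        weights[case_type][stream] = hits[(case_type, stream)] / n
    # Normalize so the weights for each case type sum to 1.
    for by_stream in weights.values():
        total = sum(by_stream.values())
        if total > 0:
            for stream in by_stream:
                by_stream[stream] /= total
    return dict(weights)

def ensemble_predict(weights, case_type, votes):
    """Weighted vote: return 1 if the weighted share predicting 1 exceeds 0.5."""
    w = weights[case_type]
    score = sum(w[stream] * vote for stream, vote in votes.items())
    return 1 if score > 0.5 else 0

WEIGHTS = learn_weights(BACKTEST)
```

With these made-up records the crowd gets no weight on patent cases and the largest weight on ERISA cases, so the same three votes can produce different ensemble forecasts depending on the case type.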
this is one slice of our
research effort …
Given our ability to offer
forecasts of judicial
outcomes, we wondered
if this information could
inform an event-driven
tr...
Paper Released
August 24, 2015
http://arxiv.org/abs/1508.05751
available at
http://papers.ssrn.com/sol3/papers.cfm?abstrac...
We call this idea
“Law on the Market”
(LOTM)
A Motivating Example
Myriad Genetics
NASDAQ: MYGN
Market Cap of ~$3 billion+
Myriad Genetics
“Myriad employs a number of proprietary
technologies that permit doctors and patients
to understand the ge...
Myriad Genetics
“Myriad was the subject of scrutiny
after it became involved in a lengthy
lawsuit over its controversial p...
June 13, 2013
Supreme Court
offers this
decision
~10:05am
Initial Media
Reports and
Initial Trading
11:48am
Initial Media
Reports
Early
Afternoon
“In early afternoon trading
Thursday, Myriad shares
were up 5.4 percent, or
$2.36, a...
Final Media
Reports
[Figure, built up over several slides: Average Cumulative Abnormal Returns (vertical axis from -0.050 to 0.025) for NASDAQ: MYGN, Pegged to S&P 500 (Market Model); time axis labeled June 12, 9:30am, 10:00am ET, 11:40am ET, and 1:20pm ET]
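The "(Market Model)" label on the chart refers to a standard event-study calculation: fit stock = alpha + beta * market on an estimation window, then treat the gap between actual and fitted returns in the event window as the abnormal return and accumulate it. The sketch below assumes ordinary least squares for the fit and uses made-up return series; it is not the paper's code and does not use actual MYGN or S&P 500 data.

```python
def fit_market_model(stock_returns, market_returns):
    """Ordinary least squares fit of stock = alpha + beta * market."""
    n = len(stock_returns)
    mean_s = sum(stock_returns) / n
    mean_m = sum(market_returns) / n
    cov = sum((m - mean_m) * (s - mean_s)
              for m, s in zip(market_returns, stock_returns))
    var = sum((m - mean_m) ** 2 for m in market_returns)
    beta = cov / var
    alpha = mean_s - beta * mean_m
    return alpha, beta

def cumulative_abnormal_returns(alpha, beta, stock_returns, market_returns):
    """Abnormal return = actual - (alpha + beta * market); CAR is the running sum."""
    car, total = [], 0.0
    for s, m in zip(stock_returns, market_returns):
        total += s - (alpha + beta * m)
        car.append(total)
    return car

# Illustrative (made-up) estimation window, built so that alpha = 0.001, beta = 2.
est_market = [0.010, -0.005, 0.002, 0.007, -0.003]
est_stock = [2 * m + 0.001 for m in est_market]
alpha, beta = fit_market_model(est_stock, est_market)

# Illustrative (made-up) event window around a hypothetical decision.
event_market = [0.001, 0.000, -0.002]
event_stock = [0.030, -0.040, -0.010]
car = cumulative_abnormal_returns(alpha, beta, event_stock, event_market)
```

Averaging such series across many events would give an average cumulative abnormal return; the sketch stops at a single stock and a single event.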
Paper Released
August 24, 2015
lots of litigation decisions
are just a version of this basic idea
law = finance
this is a part of the
industry where you
need rigorous
#LegalUnderwriting
but lots of litigation decisions
are actually implicit litigation finance
(or self insurance)
#fin(legal)tech
however most implicit litigation
finance is not based upon
rigorous underwriting …
law != finance
(but it will)
we expand on this theme in this presentation
http://computationallegalstudies.com/2015/10/fin-legal-tech-laws-future-from-fi...
TheLawLab.com
FinLegalTechConference.com November 4, 2016
A Few Plugs …
LexPredict.com
ComputationalLegalStudies.com
BLOG
Michael J. Bommarito II
@ mjbommar
computationallegalstudies.com
lexpredict.com
bommaritollc.com
university of michigan ce...
Daniel Martin Katz
@ computational
computationallegalstudies.com
lexpredict.com
danielmartinkatz.com
illinois tech - chica...
The Three Forms of (Legal) Prediction: Experts, Crowds and Algorithms -- Professors Daniel Martin Katz & Michael J. Bommarito (Updated)

16,479 views

The Three Forms of (Legal) Prediction: Experts, Crowds and Algorithms -- Professors Daniel Martin Katz & Michael J. Bommarito - Illinois Tech Law / Univ of Michigan CSCS (Updated Version)

Published in: Law

  1. 1. The Three Forms of (Legal) Prediction: experts, crowds & algorithms. professor daniel martin katz, professor michael j bommarito. home | Illinois tech - chicago kent; blog | ComputationalLegalStudies; corp | LexPredict
  2. 2. Three Types of Lawyers (as described by paul lippe)
  3. 3. Mediocre Lawyers play “whack-a-mole”, reacting to problems by creating fear and friction within organizations and the impression that there is a legal risk around every corner.
  4. 4. Merely Clever Lawyers can help clients shape (perhaps distort) external perception of risk.
  5. 5. Great Lawyers design systems that balance risk and improve transparency, helping clients correctly price risk internally.
  6. 6. On Background
  7. 7. Associate Professor of Law, Illinois Tech - Chicago Kent College of Law; Affiliated Faculty, Stanford CodeX Center for Legal Informatics
  8. 8. Fellow, Stanford CodeX Center for Legal Informatics; Adjunct Professor, University of Michigan Center for Study of Complex Systems
  9. 9. Chief Strategy Officer, LexPredict; Chief Executive Officer, LexPredict
  10. 10. computationallegalstudies.com Our Blog (since 2009)
  11. 11. @ computational
  12. 12. We are #LegalInformatics Researchers
  13. 13. Quantitative Legal Prediction - or - How I Learned to Stop Worrying and Start Preparing for the Data Driven Future of the Legal Services Industry Professor Daniel Martin Katz #LegalAnalytics #LegalData #LegalPrediction
  14. 14. The United States Tax Court Cases and Dockets #TaxLitigation
  15. 15. Measuring the Complexity of the Law: The United States Code
  16. 16. Daniel Martin Katz, Joshua Gubler, Jon Zelner, Michael Bommarito, Eric Provins & Eitan Ingall, Reproduction of Hierarchy? A Social Network Analysis of the American Law Professoriate, 61 Journal of Legal Education 76 (2011)
  17. 17. Legal Language Explorer Indexing 450,000+ Cases
  18. 18. #FreeTheLaw #OpenSource
  19. 19. #ManagingFinancialRisk
  20. 20. [Figure: justices Black, Reed, Frankfurter, Douglas, Jackson, Burton, Clark, Minton, Warren, Harlan, Brennan, Whittaker, Stewart, White, Goldberg, Fortas, Marshall, Burger, Blackmun, Powell, Rehnquist, Stevens, O'Connor, Scalia, Kennedy, Souter, Thomas, Ginsburg, Breyer, Roberts, Alito, Sotomayor, Kagan; time axis 1953 to 2013; vertical scales 0.00 to 1.00; legend entries "9-0", "8-1, 7-2, 6-3", and "Reverse"] available at http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2463244 and http://arxiv.org/abs/1407.6333 (Revise and Resubmit @ PLoS One) #JudicialPrediction #PredictingLegalOutcomes
  21. 21. Acyclic digraphs arise in many natural and artificial processes. Among the broader set, dynamic citation networks represent a substantively important form of acyclic digraphs. For example, the study of such networks includes the spread of ideas through academic citations, the spread of innovation through patent citations, and the development of precedent in common law systems.
  22. 22. Legal Informatics (2017 Forthcoming), Edited Volume (Katz, Dolin & Bommarito, Editors): Daniel Martin Katz, Ron Dolin, Michael Bommarito; 35+ Contributors
  23. 23. #STEM + #LAW = Law’s Future
  24. 24. Daniel Martin Katz, The MIT School of Law? A Perspective on Legal Education in the 21st Century, University of Illinois Law Review 1431 (2014). New York Times - August 1, 2014: Daniel Martin Katz, an associate professor with expertise in big data and powerful computing and their applications to legal studies. He hopes to give his students a leg up in a job market that seems increasingly bleak, and to help them become “T-shaped,” by which he means having deep knowledge — the downward swipe of the letter T — as well as a broadened set of abilities. So providing them with information on seemingly arcane subjects like data analytics can be a career builder. “Analytics plus law gets you into a niche,” he said.
  25. 25. TheLawLab.com
  26. 26. Legal Tech + Innovation Certificate
  27. 27. Quantitative Methods for Lawyers
  28. 28. Intro Class: http://www.quantitativemethodsclass.com/ Professor Daniel Martin Katz
  29. 29. Legal Analytics: Professor Daniel Martin Katz, Professor Michael J Bommarito II
  30. 30. Advanced Class: http://www.legalanalyticscourse.com/ Professor Daniel Martin Katz, Professor Michael J. Bommarito II
  31. 31. The Age of Data Driven Law Practice
  32. 32. It Has Already Begun ...
  33. 33. implication is that every organization needs a data strategy (including law firms & inside counsel)
  34. 34. Some Examples
  35. 35. The Age of Quantitative Legal Prediction
  36. 36. The Age of Quantitative Legal Prediction
  37. 37. The Age of Quantitative Legal Prediction
  38. 38. The Age of Quantitative Legal Prediction
  39. 39. Quantitative Legal Prediction - or - How I Learned to Stop Worrying and Start Preparing for the Data Driven Future of the Legal Services Industry Professor Daniel Martin Katz
  40. 40. Today we are going to talk about one key idea in prediction
  41. 41. There are 3 Known Ways to Predict Something
  42. 42. Experts, Crowds, Algorithms
  43. 43. We could apply this to a wide range of problems
  44. 44. For today we will apply these approaches to the decisions of the Supreme Court of the United States
  45. 45. Every year, law reviews, magazine and newspaper articles, television and radio time, conference panels, blog posts, and tweets are devoted to questions such as: How will the Court rule in particular cases?
  46. 46. There are only 3 ways to predict something: Experts, Crowds, Algorithms
  47. 47. Experts
  48. 48. Theodore W. Ruger, Pauline T. Kim, Andrew D. Martin, Kevin M. Quinn, The Supreme Court Forecasting Project: Legal and Political Science Approaches to Predicting Supreme Court Decision Making, Columbia Law Review, October, 2004
  49. 49. experts
  50. 50. Case Level Prediction Justice Level Prediction 67.4% experts 58% experts From the 68 Included Cases for the 2002-2003 Supreme Court Term
  51. 51. these experts probably overfit
  52. 52. they fit to the noise and not the signal
  53. 53. we need to evaluate experts and somehow benchmark their expertise
  54. 54. from a pure forecasting standpoint
  55. 55. the best known SCOTUS predictor is
  56. 56. the law version of superforecasting
  57. 57. Crowds
  58. 58. crowds
  59. 59. https://fantasyscotus.lexpredict.com/case/list/ We can generate Crowd Sourced Predictions
  60. 60. however, not all members of the crowd are created equal
  61. 61. we maintain a ‘supercrowd’ which is the top n% of predictors up to time t
  62. 62. the ‘supercrowd’ outperforms the overall crowd (and the best single player)
  63. 63. (performance for the 2015 - 2016 term)
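The 'supercrowd' rule (the top n% of predictors up to time t) can be sketched as follows. This is a minimal illustration, not the FantasySCOTUS implementation: the player names, the track records, and the use of a simple majority vote among the selected players are assumptions made for the example.

```python
def accuracy_up_to(history, player, t):
    """Share of a player's predictions made at or before time t that were correct."""
    past = [correct for (time, correct) in history[player] if time <= t]
    return sum(past) / len(past) if past else 0.0

def supercrowd(history, t, top_fraction):
    """Return the top `top_fraction` of players ranked by accuracy up to time t."""
    ranked = sorted(history, key=lambda p: accuracy_up_to(history, p, t), reverse=True)
    size = max(1, int(len(ranked) * top_fraction))
    return ranked[:size]

def crowd_prediction(votes, members):
    """Majority vote (1 = reverse, 0 = affirm) among the chosen members."""
    chosen = [votes[p] for p in members if p in votes]
    return 1 if sum(chosen) * 2 > len(chosen) else 0

# Hypothetical track records: player -> list of (time, was_correct).
HISTORY = {
    "ann": [(1, True), (2, True), (3, True)],
    "bob": [(1, True), (2, False), (3, False)],
    "cat": [(1, False), (2, True), (3, True)],
    "dan": [(1, False), (2, False), (3, True)],
}
# Hypothetical votes on a new case.
VOTES = {"ann": 1, "bob": 0, "cat": 1, "dan": 0}
```

Because membership is recomputed from performance up to time t, the supercrowd can change as the term progresses: in this toy data it is ann and bob at t = 1 but ann and cat at t = 3.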
  64. 64. not enough crowd based decision making in (legal) institutions
  65. 65. “Software developers were asked on two separate days to estimate the completion time for a given task, the hours they projected differed by 71%, on average. When pathologists made two assessments of the severity of biopsy results, the correlation between their ratings was only .61 (out of a perfect 1.0), indicating that they made inconsistent diagnoses quite frequently. Judgments made by different people are even more likely to diverge.”
  66. 66. in law, here is our commercial offering
  67. 67. designed to unlock untapped expertise in organizations #Winning
  68. 68. Allowing for Frictionless Crowdsourcing #ManualUnderwriting
  69. 69. https://lexsemble.com/
  70. 70. https://lexsemble.com/
  71. 71. Algorithms
  72. 72. algorithms: Online Learning Model [Figure: justices Black through Kagan, time axis 1953 to 2013, vertical scales 0.00 to 1.00, legend entries "9-0", "8-1, 7-2, 6-3", and "Reverse"]
  73. 73. Our approach is a special version of random forest [Figure: justices Black through Kagan, time axis 1953 to 2013, vertical scales 0.00 to 1.00, legend entries "9-0", "8-1, 7-2, 6-3", and "Reverse"] available at http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2463244 and http://arxiv.org/abs/1407.6333 (Revise and Resubmit @ PLoS One)
  74. 74. we have developed an algorithm that we call {Marshall}+ random forest
  75. 75. Benchmarking since 1953 + Using only data available prior to the decision. Feature weights by group:
      Justice and Court Background Information (TOTAL 0.04403): Justice [S] 0.00781; Justice Gender [FE] 0.00205; Is Chief [FE] 0.00283; Party President [FE] 0.00604; Natural Court [S] 0.00764; Segal Cover Score [SC] 0.00971; Year of Birth [FE] 0.00793
      Case Information (TOTAL 0.22814): Admin Action [S] 0.00978; Case Origin [S] 0.00971; Case Origin Circuit [S] 0.00845; Case Source [S] 0.00953; Case Source Circuit [S] 0.01015; Law Type [S] 0.01370; Lower Court Disposition Direction [S] 0.01190; Lower Court Disposition [S] 0.01125; Lower Court Disagreement [S] 0.00706; Issue [S] 0.01541; Issue Area [S] 0.01469; Jurisdiction Manner [S] 0.00595; Month Argument [FE] 0.02014; Month Decision [FE] 0.01349; Petitioner [S] 0.01406; Petitioner Binned [FE] 0.01199; Respondent [S] 0.01490; Respondent Binned [FE] 0.01179; Cert Reason [S] 0.01408
      Overall Historic Supreme Court Trends (TOTAL 0.12663): Mean Court Direction [FE] 0.00988; Mean Court Direction 10 [FE] 0.01997; Mean Court Direction Issue [FE] 0.01546; Mean Court Direction Issue 10 [FE] 0.00938; Mean Court Direction Petitioner [FE] 0.00863; Mean Court Direction Petitioner 10 [FE] 0.00904; Mean Court Direction Respondent [FE] 0.00875; Mean Court Direction Respondent 10 [FE] 0.00925; Mean Court Direction Circuit Origin [FE] 0.00791; Mean Court Direction Circuit Origin 10 [FE] 0.00864; Mean Court Direction Circuit Source [FE] 0.00951; Mean Court Direction Circuit Source 10 [FE] 0.01017
      Lower Court Trends (TOTAL 0.07946): Mean Lower Court Direction Circuit Source [FE] 0.00962; Mean Lower Court Direction Circuit Source 10 [FE] 0.01017; Mean Lower Court Direction Issue [FE] 0.01334; Mean Lower Court Direction Issue 10 [FE] 0.00933; Mean Lower Court Direction Petitioner [FE] 0.00949; Mean Lower Court Direction Petitioner 10 [FE] 0.00874; Mean Lower Court Direction Respondent [FE] 0.00973; Mean Lower Court Direction Respondent 10 [FE] 0.00900
      Current Supreme Court Trends (TOTAL 0.14456): Mean Agreement Level of Current Court [FE] 0.00955; Std. Dev. of Agreement Level of Current Court [FE] 0.00936; Mean Current Court Direction Circuit Origin [FE] 0.00789; Std. Dev. Current Court Direction Circuit Origin [FE] 0.00850; Mean Current Court Direction Circuit Source [FE] 0.00945; Std. Dev. Current Court Direction Circuit Source [FE] 0.01021; Mean Current Court Direction Issue [FE] 0.01469; Z-Score Current Court Direction Issue [FE] 0.00832; Std. Dev. Current Court Direction Issue [FE] 0.01266; Mean Current Court Direction [FE] 0.00918; Std. Dev. Current Court Direction [FE] 0.00942; Mean Current Court Direction Petitioner [FE] 0.00863; Std. Dev. Current Court Direction Petitioner [FE] 0.00894; Mean Current Court Direction Respondent [FE] 0.00882; Std. Dev. Current Court Direction Respondent [FE] 0.00888
      Individual Supreme Court Justice Trends (TOTAL 0.14323): Mean Justice Direction [FE] 0.01248; Mean Justice Direction 10 [FE] 0.01530; Mean Justice Direction Z Score [FE] 0.00826; Mean Justice Direction Petitioner [FE] 0.00732; Mean Justice Direction Petitioner 10 [FE] 0.01027; Mean Justice Direction Respondent [FE] 0.00724; Mean Justice Direction Respondent 10 [FE] 0.01030; Mean Justice Direction for Circuit Origin [FE] 0.00792; Mean Justice Direction for Circuit Origin 10 [FE] 0.00945; Mean Justice Direction for Circuit Source [FE] 0.00891; Mean Justice Direction for Circuit Source 10 [FE] 0.00970; Mean Justice Direction by Issue [FE] 0.01881; Mean Justice Direction by Issue 10 [FE] 0.00950; Mean Justice Direction by Issue Z Score [FE] 0.00771
      Differences in Trends (TOTAL 0.23391): Difference Justice Court Direction [FE] 0.01210; Abs. Difference Justice Court Direction [FE] 0.00929; Difference Justice Court Direction Issue [FE] 0.01167; Abs. Difference Justice Court Direction Issue [FE] 0.00968; Z Score Difference Justice Court Direction Issue [FE] 0.01055; Difference Justice Court Direction Petitioner [FE] 0.00705; Abs. Difference Justice Court Direction Petitioner [FE] 0.00708; Difference Justice Court Direction Respondent [FE] 0.00690; Abs. Difference Justice Court Direction Respondent [FE] 0.00699; Z Score Justice Court Direction Difference [FE] 0.01280; Justice Lower Court Direction Difference [FE] 0.01922; Justice Lower Court Direction Abs. Difference [FE] 0.02494; Justice Lower Court Direction Z Score [FE] 0.01126; Z Score Justice Lower Court Direction Difference [FE] 0.00992; Agreement of Justice with Majority [FE] 0.00866; Agreement of Justice with Majority 10 [FE] 0.01483; Difference Court and Lower Ct Direction [FE] 0.01522; Abs. Difference Court and Lower Ct Direction [FE] 0.01199; Z-Score Difference Court and Lower Ct Direction [FE] 0.01217; Z-Score Abs. Difference Court and Lower Ct Direction [FE] 0.01150
  76. 76. Total Cases Predicted: 7,700; Total Votes Predicted: 68,964
  77. 77. Justice Prediction: 70.9% accuracy; Case Prediction: 69.6% accuracy; From 1953 - 2014; Version 2.0 is 1791 - 2015
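Benchmarking while "using only data available prior to the decision" amounts to a walk-forward evaluation: each decision is scored by a model that has seen only earlier decisions. The sketch below shows that loop with a deliberately trivial stand-in model (predict the most common past outcome); the stand-in, the outcome coding, and the data are assumptions for illustration and are not the {Marshall}+ model or its data.

```python
from collections import Counter

def most_common_outcome(past_outcomes, default=1):
    """Baseline model: predict the outcome seen most often so far."""
    if not past_outcomes:
        return default
    return Counter(past_outcomes).most_common(1)[0][0]

def walk_forward_accuracy(decisions):
    """Score each decision using only decisions from strictly earlier terms.

    `decisions` is a list of (term, outcome) pairs,
    with outcome coded 1 = reverse, 0 = affirm.
    """
    correct = 0
    for term, outcome in decisions:
        past = [o for (t, o) in decisions if t < term]
        correct += int(most_common_outcome(past) == outcome)
    return correct / len(decisions)

# Made-up outcomes for illustration only.
DECISIONS = [(1953, 1), (1953, 1), (1953, 0), (1954, 0), (1954, 1), (1955, 1)]
ACCURACY = walk_forward_accuracy(DECISIONS)
```

The key design choice is the `t < term` filter: nothing from the same term or later leaks into the prediction, which is what makes a long back test comparable to live forecasting.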
  78. 78. Small Taste of How Our Algorithm Works …
  79. 79. Breiman (1984) sets forth the CART algorithm
  80. 80. Theodore W. Ruger, Pauline T. Kim, Andrew D. Martin, Kevin M. Quinn, The Supreme Court Forecasting Project: Legal and Political Science Approaches to Predicting Supreme Court Decision Making, Columbia Law Review, October, 2004
  81. 81. Given Some Data: (X1, Y1), ... , (Xn, Yn) Now We Have a New Set of X’s We Want to Predict the Y
  82. 82. CART (Classification & Regression Trees): Form a Binary Tree that Minimizes the Error in each leaf of the tree
  83. 83. Observe the Correspondence Between the Data and Trees
  84. 84. [Figure: points labeled 0 and 1 plotted on axes Xi1 and Xi2. Adapted from Example By Mathematical Monk]
  85. 85. [Same figure] We want to build an approach which can lead to the proper classification (labeling) of new data points ( ) that are dropped into this space
  86. 86. [Same figure]
  87. 87. [Same figure] Let's Begin to Partition the Space
  88. 88. 1 0 1 1 1 0 0 0 0 0 1 1 1 1 0 0 1 1 1 1 0 01 0 Xi1 Xi2 0 1 2 1 2 Adapted from Example By Mathematical Monk L e t s B e g i n t o Partition the Space split 1 (a)
  89. 89. 1 0 1 1 1 0 0 0 0 0 1 1 1 1 0 0 1 1 1 1 0 01 0 Xi1 Xi2 0 1 2 1 2 Adapted from Example By Mathematical Monk This Split Will Be Memorialized in theTree split 1 (a)
  90. 90. 1 0 1 1 1 0 0 0 0 0 1 1 1 1 0 0 1 1 1 1 0 01 0 Xi1 Xi2 0 1 2 1 2 Adapted from Example By Mathematical Monk We Ask the Question is Xi1 > 1 ? - with a binary (yes or no) response split 1 (a) Xi1 > 1 ? YesNo
  91. 91. 1 0 1 1 1 0 0 0 0 0 1 1 1 1 0 0 1 1 1 1 0 01 0 Xi1 Xi2 0 1 2 1 2 Adapted from Example By Mathematical Monk If No - then we are in zone (a) ... we tally the number of zeros and ones Using Majority Rule do we assign a classification to this rule this leaf split 1 (a) Xi1 > 1 ? YesNo (0,5) Classify as 1 zone (a)
  92. 92. 1 0 1 1 1 0 0 0 0 0 1 1 1 1 0 0 1 1 1 1 0 01 0 Xi1 Xi2 0 1 2 1 2 Adapted from Example By Mathematical Monk Here we Classify as a 1 because (0,5) which is 0 zero’s and 5 one’s split 1 (a) Xi1 > 1 ? YesNo (0,5) Classify as 1 zone (a)
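The majority rule used to label a leaf can be written in a few lines. This is a minimal sketch: the function name `classify_leaf` is invented for illustration, and breaking ties toward 0 is an assumption, since the slides never show a tied leaf.

```python
from typing import Tuple


def classify_leaf(counts: Tuple[int, int]) -> int:
    """Majority rule for a leaf given (number of zeros, number of ones)."""
    zeros, ones = counts
    # ASSUMPTION: ties go to 0; the slides do not cover this case.
    return 1 if ones > zeros else 0


zone_a = (0, 5)  # 0 zeros and 5 ones, as tallied for zone (a)
print(classify_leaf(zone_a))  # prints 1
```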
  93. 93. [Figure: scatter plot of 0/1 points on axes Xi1 and Xi2, with tree. Adapted from Example By Mathematical Monk] Using a Similar Approach, Let's Begin to Fill in the Rest of the Tree
  94. 94. [Same figure] split 2 adds the question Xi2 > 1.45 ? (No / Yes) on the Yes branch of Xi1 > 1 ?
  95. 95. [Same figure] split 3 adds the question Xi1 > 2 ?, creating zones (b) and (c). Leaf counts: (0,5) in zone (a), Classify as 1; (4,1), Classify as 0; (2,3), Classify as 1
  96. 96. [Same figure] split 4 adds the question Xi1 > 2.2 ?, creating zones (d) and (e). Leaf counts: (0,5), Classify as 1; (4,1), Classify as 0; (2,3), Classify as 1; (5,0), Classify as 0; (1,4), Classify as 1
  97. 97. Okay, Let's Add Back the ( ), which are new items to be classified
  98. 98. For simplicity's sake, there is one in each zone
  99. 99. We Will Use the Tree Because the Tree Is Our Prediction Machine
  100. 100. [Figure: scatter plot of 0/1 points on axes Xi1 and Xi2, partitioned into zones (a) to (e), with the completed tree from splits 1 to 4. Adapted from Example By Mathematical Monk]
  101. 101. [Same figure]
  102. 102. [Same figure] Each new point is assigned the classification of the zone (leaf) it falls into
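Using the tree as a "prediction machine" means answering one yes/no question per node until a leaf is reached. The sketch below encodes the four questions from the slides (Xi1 > 1, Xi2 > 1.45, Xi1 > 2, Xi1 > 2.2). Only the root, the position of Xi2 > 1.45 and zone (a) are clear from the transcript; which branch carries Xi1 > 2 versus Xi1 > 2.2, and which label belongs to each of zones (b) to (e), are assumptions made purely for illustration.

```python
# A leaf is (zone name, predicted label).
# An internal node is (feature index, threshold, subtree if No, subtree if Yes).
TREE = (
    0, 1.0,                  # split 1: Xi1 > 1 ?
    ("a", 1),                # No: zone (a), counts (0,5), classify as 1
    (
        1, 1.45,             # Yes: split 2: Xi2 > 1.45 ?
        (0, 2.0, ("b", 0), ("c", 1)),   # ASSUMED branch and leaf labels
        (0, 2.2, ("d", 0), ("e", 1)),   # ASSUMED branch and leaf labels
    ),
)


def predict(tree, point):
    """Walk from the root to a leaf; return (zone, label) for the point."""
    while len(tree) == 4:    # internal nodes have four fields, leaves have two
        feature, threshold, if_no, if_yes = tree
        tree = if_yes if point[feature] > threshold else if_no
    return tree


# Hypothetical new points (Xi1, Xi2), one per zone.
for new_point in [(0.5, 0.5), (1.5, 1.0), (2.5, 1.0), (1.5, 2.0), (2.5, 2.0)]:
    print(new_point, predict(TREE, new_point))
```

Once the splits are fixed, classifying a new point needs no training data at all, only the comparisons stored in the tree.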
  103. 103. Experts, Crowds, Algorithms
  104. 104. For most problems ... ensembles of these streams outperform any single stream
  105. 105. Humans + Machines
  106. 106. Humans + Machines >
  107. 107. Humans + Machines > Humans or Machines
  108. 108. Ensembles come in various forms
  109. 109. Here is a well known example
  110. 110. Poll Aggregation is one form of ensemble where the learning question is to determine how much weight (if any) to assign to each individual poll
  111. 111. poll weighting
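Poll weighting can be sketched as a weighted average in which a weight of zero drops a poll entirely. The polls, weights and the function name `aggregate_polls` below are hypothetical; how the weights should actually be learned is the open question the slide describes.

```python
from typing import List, Tuple


def aggregate_polls(polls: List[Tuple[float, float]]) -> float:
    """Weighted average of (estimate, weight) pairs; weight 0 drops a poll."""
    total_weight = sum(weight for _, weight in polls)
    if total_weight <= 0:
        raise ValueError("at least one poll needs a positive weight")
    return sum(estimate * weight for estimate, weight in polls) / total_weight


# Hypothetical polls: (candidate share in percent, assigned weight).
polls = [(52.0, 2.0), (48.0, 1.0), (60.0, 0.0)]
print(aggregate_polls(polls))
```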
  112. 112. A Visual Depiction of How to build an ensemble method in our judicial prediction example
  113. 113. [Diagram: expert, crowd, algorithm feed into an ensemble method] The learning problem is to discover when to use a given stream of intelligence
  114. 114. [Same diagram] Via back testing we can learn the weights to apply for particular problems. The learning problem is to discover when to use a given stream of intelligence
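A very simple form of learning weights by back testing is to score each stream on past questions and weight it by its share of the total accuracy. The sketch below assumes binary forecasts and uses hypothetical history; the accuracy-share rule and the names `backtest_weights` and `ensemble_predict` are illustrative choices, not the method used in the research described here.

```python
from typing import Dict, List


def backtest_weights(history: Dict[str, List[int]], outcomes: List[int]) -> Dict[str, float]:
    """Weight each stream by its share of total back-tested accuracy."""
    accuracy = {
        name: sum(p == y for p, y in zip(preds, outcomes)) / len(outcomes)
        for name, preds in history.items()
    }
    total = sum(accuracy.values())
    return {name: acc / total for name, acc in accuracy.items()}


def ensemble_predict(weights: Dict[str, float], votes: Dict[str, int]) -> int:
    """Weighted vote over binary (0 or 1) forecasts from each stream."""
    score = sum(weights[name] * votes[name] for name in votes)
    return 1 if score > 0.5 else 0


# Hypothetical back test: past forecasts from each stream vs. actual outcomes.
outcomes = [1, 0, 1, 1]
history = {
    "expert":    [1, 0, 0, 0],   # 2 of 4 correct
    "crowd":     [1, 1, 1, 1],   # 3 of 4 correct
    "algorithm": [1, 0, 1, 1],   # 4 of 4 correct
}
weights = backtest_weights(history, outcomes)
print(weights)
print(ensemble_predict(weights, {"expert": 0, "crowd": 1, "algorithm": 1}))
```

Running the same back test separately for each type of problem would give a different weight vector per problem type.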
  115. 115. {Marshall}+ algorithm
  116. 116. expert crowd algorithm
  117. 117. {Marshall}+ improvement will likely come from determining the optimal weighting of experts, crowds and algorithms for various types of cases
  118. 118. ERISA cases thus might look like this
  119. 119. Patent cases perhaps might look like this
  120. 120. while Search/Seizure cases could look like this
  121. 121. this is one slice of our research effort …
  122. 122. Given our ability to offer forecasts of judicial outcomes, we wondered if this information could inform an event-driven trading strategy?
  123. 123. Paper Released August 24, 2015: http://arxiv.org/abs/1508.05751 (also available at http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2649726)
  124. 124. We call this idea “Law on the Market” (LOTM)
  125. 125. A Motivating Example: Myriad Genetics (NASDAQ: MYGN), Market Cap of ~$3 billion+
  126. 126. Myriad Genetics: “Myriad employs a number of proprietary technologies that permit doctors and patients to understand the genetic basis of human disease and the role that genes play in the onset, progression and treatment of disease.”
  127. 127. Myriad Genetics: “Myriad was the subject of scrutiny after it became involved in a lengthy lawsuit over its controversial patenting practices”, including the patenting of human gene sequences ....
  128. 128. June 13, 2013, ~10:05am: Supreme Court offers this decision
  129. 129. Initial Media Reports and Initial Trading (11:48am)
  130. 130. Initial Media Reports, Early Afternoon: “In early afternoon trading Thursday, Myriad shares were up 5.4 percent, or $2.36, at $35.73.”
  131. 131. Final Media Reports
  132. 132. Final Media Reports
  133. 133. [Chart: Average Cumulative Abnormal Returns, NASDAQ: MYGN Pegged to S&P 500 (Market Model); y-axis from -0.050 to 0.025; time marker: June 12]
  134. 134. [Same chart; time markers: June 12, 9:30am]
  135. 135. [Same chart; time markers: June 12, 9:30am, June 13, 10:00am ET]
  136. 136. [Same chart; time markers: June 12, 9:30am, June 13, 10:00am ET, 11:40am ET]
  137. 137. [Same chart; time markers: June 12, 9:30am, June 13, 10:00am ET, 11:40am ET, 1:20pm ET]
  138. 138. [Same chart; time markers: June 12, 9:30am, June 13, 10:00am ET, 11:40am ET, 1:20pm ET, 2:15pm ET]
  139. 139. [Same chart; time markers: June 12, 9:30am, June 13, 10:00am ET, 11:40am ET, 1:20pm ET, 2:15pm ET, Close]
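Cumulative abnormal returns under a market model are commonly computed by fitting the stock's return against the index return over an estimation window, then summing the gaps between actual and expected returns over the event window. The sketch below assumes that standard textbook setup; the return series are hypothetical (not MYGN or S&P 500 data), and the window choices and function names are illustrative rather than taken from the paper.

```python
from typing import List, Tuple


def fit_market_model(stock: List[float], market: List[float]) -> Tuple[float, float]:
    """Ordinary least squares fit of stock = alpha + beta * market."""
    n = len(stock)
    mean_s = sum(stock) / n
    mean_m = sum(market) / n
    cov = sum((m - mean_m) * (s - mean_s) for s, m in zip(stock, market))
    var = sum((m - mean_m) ** 2 for m in market)
    beta = cov / var
    alpha = mean_s - beta * mean_m
    return alpha, beta


def cumulative_abnormal_returns(stock: List[float], market: List[float],
                                alpha: float, beta: float) -> List[float]:
    """Running sum of abnormal returns: actual minus market-model expected."""
    car, total = [], 0.0
    for s, m in zip(stock, market):
        total += s - (alpha + beta * m)
        car.append(total)
    return car


# Hypothetical estimation-window returns (stock moves twice as much as the index).
est_stock = [0.010, -0.020, 0.030, 0.000]
est_market = [0.005, -0.010, 0.015, 0.000]
alpha, beta = fit_market_model(est_stock, est_market)

# Hypothetical event-window returns.
event_car = cumulative_abnormal_returns([0.030, -0.050], [0.010, 0.000], alpha, beta)
print(alpha, beta, event_car)
```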
  140. 140. Paper Released August 24, 2015
  141. 141. lots of litigation decisions are just a version of this basic idea: law = finance
  142. 142. this is a part of the industry where you need rigorous #LegalUnderwriting
  143. 143. but lots of litigation decisions are actually implicit litigation finance (or self insurance) #fin(legal)tech
  144. 144. however, most implicit litigation finance is not based upon rigorous underwriting … law != finance (but it will be)
  145. 145. we expand on this theme in this presentation http://computationallegalstudies.com/2015/10/fin-legal-tech-laws-future-from-finances-past-katz-bommartio/
  146. 146. TheLawLab.com
  147. 147. FinLegalTechConference.com, November 4, 2016
  148. 148. A Few Plugs …
  149. 149. LexPredict.com
  150. 150. ComputationalLegalStudies.com BLOG
  151. 151. Michael J. Bommarito II | @mjbommar | computationallegalstudies.com | lexpredict.com | bommaritollc.com | university of michigan center for the study of complex systems
  152. 152. Daniel Martin Katz | @computational | computationallegalstudies.com | lexpredict.com | danielmartinkatz.com | illinois tech - chicago kent college of law
