Legal Analytics versus Empirical Legal Studies - or - Causal Inference vs Prediction Redux

Published in: Law

  1. 1. legal analytics vs. empirical legal studies -or- causal inference vs prediction redux | daniel martin katz | blog: ComputationalLegalStudies.com | corp: LexPredict.com | page: DanielMartinKatz.com | edu: chicago-kent college of law | lab: theLawLab.com
  2. 2. ELS and Legal Analytics: Never the Twain Shall Meet? -OR- Partners in the Same Pursuit
  3. 3. I thought I might offer a quick landscape orientation regarding terminology, methods, etc.
  4. 4. The ‘Empirical Turn’ in Legal Scholarship
  5. 5. Legal Scholarship Has become far more ‘empirical’ in nature
  6. 6. Goal: Develop optimal (better) legal rules for various areas of human endeavor
  7. 7. Tools: Use traditional social science methods (typically econometric tools): Instrumental Variables, Propensity Score Matching, Rubin Causal Model, Regression Discontinuity, Difference in Differences, etc.
  8. 8. Outcome: Determine (as best possible) whether a particular policy intervention achieves the desired ends
  9. 9. relative to the alternative …
  10. 10. This represents a material improvement in the state of affairs …
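To make the "policy intervention relative to the alternative" idea concrete, here is a minimal sketch of a difference-in-differences estimate, one of the tools named above. The function name and all numbers are hypothetical and invented for illustration; a real study would add standard errors and test the parallel-trends assumption.

```python
from statistics import mean

def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """Change in the treated group minus the change in the control group."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change

# Hypothetical outcome values (invented for illustration only).
treated_pre = [10.0, 12.0, 11.0]   # jurisdictions that adopted the rule, before
treated_post = [7.0, 8.0, 9.0]     # same jurisdictions, after
control_pre = [9.0, 10.0, 11.0]    # jurisdictions that did not adopt, before
control_post = [9.0, 9.0, 9.0]     # same jurisdictions, after

effect = diff_in_diff(treated_pre, treated_post, control_pre, control_post)
print(effect)
```

The control group's change stands in for "the alternative": what would plausibly have happened to the treated group without the intervention.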
  11. 11. The Diversity of Tasks that Lawyers Undertake
  12. 12. There is a diverse set of tasks that lawyers undertake …
  13. 13. Let us divide the space: (1) Lawyer as Policy Maker, Appellate Judge; (2) Lawyer as Strategist, Predictor, Master of Process
  14. 14. Lawyer as Policy Maker, Appellate Judge
  15. 15. Causal Inference is at the core of the ‘empirical turn’ that has taken hold in law as well as the social sciences
  16. 16. Such approaches are best suited to problems / questions where identifying / linking cause and effect is key
  17. 17. Some Epistemological Issues / Questions
  18. 18. Some would make the epistemological / methodological case that prediction > causal inference
  19. 19. Part of that case comes from finance, trading, etc. where causal inference tools are generally not used
  20. 20. A Useful Quote for Your Consideration …
  21. 21. Andrew D. Martin, Kevin M. Quinn, Theodore W. Ruger & Pauline T. Kim, Competing Approaches to Predicting Supreme Court Decision Making, 2 Perspectives on Politics 761 (2004). “the best test of an explanatory theory is its ability to predict future events. To the extent that scholars in both disciplines (social science and law) seek to explain court behavior, they ought to test their theories not only against cases already decided, but against future outcomes as well.”
  24. 24. Other folks are starting to ask similar questions …
  25. 25. So I believe that we will see more efforts in the coming years to do both backward and ‘forward causal inference’ in the policy sphere
  26. 26. The Other Type of Work That Lawyers Do
  27. 27. Let us divide the space: (1) Lawyer as Policy Maker, Appellate Judge; (2) Lawyer as Strategist, Predictor, Master of Process
  28. 28. Lawyer as Strategist, Predictor, Master of Process
  29. 29. This version of the lawyer taskset is often directed at trying to forecast / predict future events
  30. 30. When you hear prediction you should think … #AI #LegalTech #MachineLearning #LegalAnalytics
  31. 31. Goal: Predict the behavior of some form of legal, regulatory institution
  32. 32. Tools: Use Some Blend of Experts, Crowds, Algorithms to Forecast Outcomes, Craft Optimal Strategies, etc.
  33. 33. Quantitative Legal Prediction
  34. 34. Historically speaking, there were practically zero papers in law that used any form of machine learning
  35. 35. There has been growing interest in rigorous, out-of-sample prediction in law #AI #LegalTech #MachineLearning #LegalAnalytics
  40. 40. Katz DM, Bommarito MJ II, Blackman J (2017) A general approach for predicting the behavior of the Supreme Court of the United States. PLoS ONE 12(4): e0174698. https://doi.org/10.1371/journal.pone.0174698 (research article, open access; available at http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0174698)
      Authors: Daniel Martin Katz (1,2)*, Michael J. Bommarito II (1,2), Josh Blackman (3). (1) Illinois Tech - Chicago-Kent College of Law, Chicago, IL, United States of America; (2) CodeX - The Stanford Center for Legal Informatics, Stanford, CA, United States of America; (3) South Texas College of Law Houston, Houston, TX, United States of America. * dkatz3@kentlaw.iit.edu
      Abstract: Building on developments in machine learning and prior work in the science of judicial prediction, we construct a model designed to predict the behavior of the Supreme Court of the United States in a generalized, out-of-sample context. To do so, we develop a time-evolving random forest classifier that leverages unique feature engineering to predict more than 240,000 justice votes and 28,000 cases outcomes over nearly two centuries (1816-2015). Using only data available prior to decision, our model outperforms null (baseline) models at both the justice and case level under both parametric and non-parametric tests. Over nearly two centuries, we achieve 70.2% accuracy at the case outcome level and 71.9% at the justice vote level. More recently, over the past century, we outperform an in-sample optimized null model by nearly 5%. Our performance is consistent with, and improves on the general level of prediction demonstrated by prior work; however, our model is distinctive because it can be applied out-of-sample to the entire past and future of the Court, not a single term. Our results represent an important advance for the science of quantitative legal prediction and portend a range of other potential applications.
      Introduction: As the leaves begin to fall each October, the first Monday marks the beginning of another term for the Supreme Court of the United States. Each term brings with it a series of challenging, important cases that cover legal questions as diverse as tax law, freedom of speech, patent law, administrative law, equal protection, and environmental law. In many instances, the Court’s decisions are meaningful not just for the litigants per se, but for society as a whole. Unsurprisingly, predicting the behavior of the Court is one of the great pastimes for legal and political observers. Every year, newspapers, television and radio pundits, academic journals, law reviews, magazines, blogs, and tweets predict how the Court will rule in a particular case. Will the Justices vote based on the political preferences of the President who appointed them or form a coalition along other dimensions? Will the Court counter expectations with an unexpected ruling?
      Editor: Luís A. Nunes Amaral, Northwestern University, UNITED STATES. Received: January 17, 2017. Accepted: March 13, 2017. Published: April 12, 2017.
      Copyright: © 2017 Katz et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
      Data Availability Statement: Data and replication code are available on Github at the following URL: https://github.com/mjbommar/scotus-predict-v2/.
      Funding: The author(s) received no specific funding for this work.
      Competing interests: All Authors are Members of a LexPredict, LLC which provides consulting services to various legal industry stakeholders. We received no financial contributions from LexPredict or anyone else for this paper. This does not alter our adherence to PLOS ONE policies on sharing data and materials.
  41. 41. There has been growing interest in conducting analysis of legal system(s) at scale
  42. 42. “…I study choice of law by analyzing the nearly 1,000,000 contracts that have been disclosed to the Securities and Exchange Commission between 1996–2012.”
  43. 43. In this paper, we analyze over 4.5 million references to U.S. Federal Acts and Agencies contained within these 10-K reports to build a mean-field measurement of temperature and diversity in this regulatory ecosystem
  44. 44. There has also been a significant amount of commercial interest linked to legal analytics
  45. 45. For example, here are just a few predictions that lawyers are trying to accomplish on a daily basis
  46. 46. #Predict Relevant Documents: Data Driven EDiscovery/Due Diligence (Predictive Coding); #Predict Contract Terms/Outcomes: Data Driven Transactional Work; #Predict Rogue Behavior: Data Driven Compliance; #Predict Case Outcomes: Data Driven Legal Underwriting; #Predict Regulatory Outcomes: Data Driven Lobbying, etc.
  51. 51. Not only law firms but also the large enterprise clients …
  52. 52. “From settlement information and contracts to sensitive client data and beyond, Liberty Mutual creates and stores ever-growing volumes of unorganized data across its worldwide offices and databases.” “I've seen a real transformation in the legal department just having that information visually available.” “The legal department is now working predictive and prescriptive analytics,” i.e. ways to analyze data that enable forecasting for legal issues.
  54. 54. “Now we have program managers, data analysts, business analysts, data scientists, operations managers, I mean, we have a ton of stuff. That's the key for me, is thinking about the right people doing the right tasks. That's the people part. And then how they do them, is the process, and then, automating parts, is kind of that next, final step. And all of that is underpinned by data. You can't do any improvements unless you have data. You can't automate unless you have good data.”
  55. 55. “I believe strongly that data analytics is a new frontier in the legal space.” Susie Lees, General Counsel, Allstate. “Leveraging data, not only that we possess but that our law firms have amassed over the years, offers a plethora of untapped opportunities - not simply to help us forecast and manage legal expenses, but also to help our clients make more informed business decisions.”
  56. 56. Insofar as prediction is the task in question … #LegalTech #FinTech #Fin(Legal)Tech
  57. 57. “The real roll-up of all this isn’t robot lawyers, it’s financialization, with law becoming an applied branch of finance and insurance.” Daniel Martin Katz, professor, Illinois Tech’s Chicago Kent College of Law http://www.ozy.com/fast-forward/why-artificial-intelligence-might-replace-your-lawyer/75435
  58. 58. #Fin(Legal)Tech: go here for a detailed treatment of the question: https://computationallegalstudies.com/2016/02/27/fin-legal-tech-laws-future-from-finances-past-an-expanded-version-of-the-deck/
  59. 59. The Three Forms of (Legal) Prediction
  60. 60. www.legalanalyticscourse.com
  61. 61. Insofar as prediction is the task in question … #MachineLearning is the method du jour
  62. 62. It is not necessarily ML alone but rather some ensemble of experts, crowds + algorithms
  63. 63. Professor Katz noted that in the long term … “We believe the blend of experts, crowds, and algorithms is the secret sauce for the whole thing.” (May 2nd 2017) http://www.sciencemag.org/news/2017/05/artificial-intelligence-prevails-predicting-supreme-court-decisions
  64. 64. example from our own work
  65. 65. predicting the decisions of the Supreme Court of the United States #SCOTUS
  66. 66. Experts
  67. 67. Theodore W. Ruger, Pauline T. Kim, Andrew D. Martin, Kevin M. Quinn, The Supreme Court Forecasting Project: Legal and Political Science Approaches to Predicting Supreme Court Decision Making, Columbia Law Review, October 2004
  68. 68. experts
  69. 69. Case Level Prediction Justice Level Prediction 67.4% experts 58% experts From the 68 Included Cases for the 2002-2003 Supreme Court Term
  70. 70. these experts probably overfit
  71. 71. they fit to the noise and not the signal
  72. 72. if this were finance this would be trading worse than S&P500
  73. 73. #NoiseTrading
  74. 74. #BuffettChallenge
  75. 75. like many other forms of human endeavor, law is full of noise predictors …
  76. 76. we need to evaluate legal experts and somehow benchmark their expertise
  77. 77. from a pure forecasting standpoint
  78. 78. the best known SCOTUS predictor is
  79. 79. the law version of superforecasting
  80. 80. Crowds
  81. 81. crowds
  82. 82. https://fantasyscotus.lexpredict.com/case/list/ We can generate Crowd Sourced Predictions
  83. 83. not all members of the crowd are created equal
  84. 84. we maintain a ‘supercrowd’ which is the top n of predictors up to time t-1
  85. 85. the ‘supercrowd’ outperforms the overall crowd (and even the best single player)
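The supercrowd idea above (the top n predictors, ranked using only information up to time t-1) can be sketched as follows. This is an illustrative reading, not the FantasySCOTUS implementation: the ranking metric (plain accuracy), the majority vote, the tie-breaking rules, and all player names and data are assumptions.

```python
from collections import Counter

def supercrowd(history, n):
    """Return the n players with the best accuracy on already-decided cases.

    history maps player -> list of (predicted, actual) pairs up to time t-1.
    """
    def accuracy(pairs):
        return sum(p == a for p, a in pairs) / len(pairs) if pairs else 0.0
    # Sort by accuracy (descending), then by name so ties are deterministic.
    ranked = sorted(history, key=lambda player: (-accuracy(history[player]), player))
    return ranked[:n]

def crowd_forecast(predictions, members):
    """Majority vote among the given members; ties break alphabetically."""
    votes = Counter(predictions[m] for m in members if m in predictions)
    top = max(votes.values())
    return min(label for label, count in votes.items() if count == top)

# Hypothetical track records on three decided cases (invented for illustration).
history = {
    "ann": [("reverse", "reverse"), ("affirm", "affirm"), ("reverse", "reverse")],
    "bob": [("reverse", "affirm"), ("affirm", "affirm"), ("reverse", "reverse")],
    "cat": [("affirm", "reverse"), ("reverse", "affirm"), ("reverse", "reverse")],
    "dan": [("affirm", "reverse"), ("reverse", "affirm"), ("affirm", "reverse")],
}
# Hypothetical predictions for the next, undecided case.
next_case = {"ann": "reverse", "bob": "reverse", "cat": "affirm", "dan": "affirm"}

top2 = supercrowd(history, n=2)
print(top2, crowd_forecast(next_case, top2))
```

Because the ranking uses only past cases, the selection never sees the outcome it is trying to forecast, which is what keeps the comparison with the overall crowd fair.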
  86. 86. not enough crowd based decision making in institutions (law included)
  87. 87. “Software developers were asked on two separate days to estimate the completion time for a given task, the hours they projected differed by 71%, on average. When pathologists made two assessments of the severity of biopsy results, the correlation between their ratings was only .61 (out of a perfect 1.0), indicating that they made inconsistent diagnoses quite frequently. Judgments made by different people are even more likely to diverge.”
  88. 88. Brief Aside About Crowd Sourced Prediction #LegalCrowdSourcing
  89. 89. Neil Gorsuch was #1 on our Fantasy Platform 12 days after Donald Trump was elected President (i.e., Nov 20); most pundits did not identify him as a serious candidate until mid-January 2017
  90. 90. #FantasySCOTUS
  91. 91. Algorithms
  92. 92. Theodore W. Ruger, Pauline T. Kim, Andrew D. Martin, Kevin M. Quinn, The Supreme Court Forecasting Project: Legal and Political Science Approaches to Predicting Supreme Court Decision Making, Columbia Law Review, October 2004
  93. 93. Ruger, et al. (2004) relied upon Breiman (1984) (as partially shown below)
  94. 94. Leo Breiman moved away from CART in Breiman (2001)
  95. 95. Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5-32. Published in Machine Learning (A Springer Science Journal)
  96. 96. One well-known problem with standard classification trees is their tendency toward overfitting
  97. 97. Random forests (particularly with special config/optimization) have proven to be unreasonably effective http://machinelearning202.pbworks.com/w/file/fetch/37597425/performanceCompSupervisedLearning-caruana.pdf
  98. 98. Random forest is an approach that aggregates weak learners into a collectively strong learner (using a combination of bagging and random feature subspaces); think of it as crowd sourcing of models
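The "crowd sourcing of models" intuition can be illustrated with a toy ensemble. This is a minimal sketch, not the authors' time-evolving model: it uses one-split decision stumps on binary features as the weak learners, whereas a real random forest grows deeper trees and draws a fresh feature subset at every split. The function names, feature meanings, and data are all hypothetical.

```python
import random
from collections import Counter

def majority(labels):
    """Most common label (the first one seen wins ties)."""
    return Counter(labels).most_common(1)[0][0]

def fit_stump(rows, labels, feature_ids):
    """Pick the feature (from feature_ids) whose 0/1 split best fits the sample."""
    best = None
    for f in feature_ids:
        sides = {}
        for value in (0, 1):
            side = [y for x, y in zip(rows, labels) if x[f] == value]
            # Fall back to the overall majority if one side of the split is empty.
            sides[value] = majority(side) if side else majority(labels)
        correct = sum(sides[x[f]] == y for x, y in zip(rows, labels))
        if best is None or correct > best[0]:
            best = (correct, f, sides)
    return best[1], best[2]

def fit_forest(rows, labels, n_trees=25, n_features=2, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        # Bagging: each learner sees a bootstrap resample of the cases.
        idx = [rng.randrange(len(rows)) for _ in rows]
        # Random subspace: each learner sees only a random subset of features.
        feats = rng.sample(range(len(rows[0])), n_features)
        forest.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx], feats))
    return forest

def predict(forest, x):
    """Majority vote over all learners."""
    return majority([sides[x[f]] for f, sides in forest])

# Hypothetical binary case features and outcomes (1 = reverse), for illustration only.
rows = [(1, 0, 1), (1, 1, 0), (1, 0, 0), (1, 1, 1),
        (0, 0, 1), (0, 1, 0), (0, 0, 0), (0, 1, 1)]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

forest = fit_forest(rows, labels)
print(predict(forest, (1, 0, 0)))
```

Each stump on its own is a weak and noisy predictor; the resampling and feature restriction make the learners disagree in different ways, and the vote averages that noise out.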
  99. 99. Our algorithm is a special version of random forest (time evolving), available at http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0174698
  100. 100. We call this a ‘general’ model of #SCOTUS Prediction available at https://arxiv.org/pdf/1612.03473
  101. 101. Not just interested in accuracy over a short time window
  102. 102. A locally tuned model will typically lead to overfitting as the dynamics shift
  103. 103. We want a model that is robust to a large number of known dynamics …
  104. 104. Current Version of #PredictSCOTUS (Version 2.02, January 16, 2017), 1816-2015: 243,882 Justice Votes; 28,009 Case Outcomes
  105. 105. Current Version of #PredictSCOTUS (Version 2.02, January 16, 2017), 1816-2015: case accuracy 70.2%; justice accuracy 71.9%
  106. 106. But are these results ‘good’ ?
  107. 107. What constitutes ‘good’ performance in this context?
  108. 108. We Craft Three Alternative ‘Null’ Models
  109. 109. Our Model Against the Null Models (Null Model 1): the ‘always guess reverse’ model. Some commentators had suggested using a heuristic rule of ‘always guess reverse’ as a baseline. Turns out it is a lousy model prior to ~1950 because the reversal rate is not stable over time
  110. 110. Our Model Against the Null Models (Null Model 2): memory window = inf. What about a memory window that selects the most frequent historical outcome? This is our model against Null Model 2 (green = our model outperforms)
  111. 111. Our Model Against the Null Models (Null Model 3): finite memory window = 10. We in-sample optimize using future information to select a null model that is among the best performing of all null models. As it is using in-sample info, this is a deeply unfair null
  112. 112. Over the past century, we outperform M=10 by nearly 5% and have significant temporal stability at the justice, case, and term levels
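The three null models above can be sketched as simple guessing rules scored walk-forward, so that each guess sees only earlier outcomes. This is an illustrative simplification: the function names are hypothetical, the memory window here counts individual outcomes (the slides do not say whether M is measured in cases or terms), and the outcome sequence is invented to mimic a reversal rate that shifts over time.

```python
from collections import Counter

def always_reverse(past):
    """Null Model 1: ignore history and always guess 'reverse'."""
    return "reverse"

def most_frequent(past, window=None):
    """Null Models 2 and 3: guess the most frequent outcome in memory.

    window=None uses all history (infinite memory); window=M uses the last M outcomes.
    """
    recent = past if window is None else past[-window:]
    if not recent:
        return "reverse"  # arbitrary default before any history exists
    return Counter(recent).most_common(1)[0][0]

def walk_forward_accuracy(outcomes, guess):
    """Score a rule that sees only the outcomes decided before each case."""
    hits = sum(guess(outcomes[:i]) == actual for i, actual in enumerate(outcomes))
    return hits / len(outcomes)

# Hypothetical outcome sequence whose reversal rate shifts halfway through.
outcomes = ["affirm"] * 6 + ["reverse"] * 6

print(walk_forward_accuracy(outcomes, always_reverse))
print(walk_forward_accuracy(outcomes, most_frequent))
print(walk_forward_accuracy(outcomes, lambda past: most_frequent(past, window=3)))
```

On this toy sequence the finite window adapts to the shift while the infinite-memory rule stays anchored to the early period. Choosing the best window after seeing all the results, as with M=10 above, uses future information, which is why the slides call that null deeply unfair to the model being tested.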
  113. 113. Experts, Crowds, Algorithms
  114. 114. For most problems ... ensembles of these streams outperform any single stream
  115. 115. the non-trivial question is how to optimally assemble such streams for particular problems
  116. 116. Humans + Machines
  117. 117. Humans + Machines > Humans or Machines
  119. 119. Here is what we are working on right now …
  120. 120. expert forecast + crowd forecast + algorithm forecast → ensemble method → ensemble model; the learning problem is to discover how to blend streams of intelligence
  121. 121. via back testing we can learn the weights to apply for particular problems
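One simple way to read "learn the weights via back testing" is a search over blending weights that minimizes forecast error on already-decided cases. The sketch below is an assumption, not the ensemble method from the deck: it uses a coarse grid over non-negative weights that sum to 1, scores blends with the Brier score, and runs on invented probabilities.

```python
from itertools import product

def blend(probs, weights):
    """Weighted average of the three stream probabilities for one case."""
    return sum(w * p for w, p in zip(weights, probs))

def brier(forecasts, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    return sum((f - y) ** 2 for f, y in zip(forecasts, outcomes)) / len(outcomes)

def learn_weights(streams, outcomes, steps=10):
    """Grid-search weights (non-negative, summing to 1) with the best backtest score."""
    best_score, best_weights = None, None
    for a, b in product(range(steps + 1), repeat=2):
        if a + b > steps:
            continue
        weights = (a / steps, b / steps, (steps - a - b) / steps)
        score = brier([blend(case, weights) for case in streams], outcomes)
        if best_score is None or score < best_score:
            best_score, best_weights = score, weights
    return best_weights, best_score

# Hypothetical backtest: (expert, crowd, algorithm) probabilities of reversal per case.
backtest = [(0.6, 0.7, 0.8), (0.4, 0.3, 0.2), (0.7, 0.6, 0.9),
            (0.5, 0.4, 0.3), (0.8, 0.7, 0.6)]
outcomes = [1, 0, 1, 0, 1]  # 1 = reversed, 0 = affirmed

weights, score = learn_weights(backtest, outcomes)
print(weights, score)
```

Because the grid includes the three single-stream weightings, the selected blend can never score worse than the best single stream on the backtest itself. Whether it holds up on new cases is a separate question that needs held-out or walk-forward evaluation.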
  122. 122. By the way, you might ask why does one care about marginal improvements in prediction ? #Fin(Legal)Tech
  123. 123. Given our ability to offer forecasts of judicial outcomes, we wondered if this information could inform an event driven trading strategy ?
  124. 124. Revise + Resubmit @ http://arxiv.org/abs/1508.05751 available at http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2649726
  125. 125. In Summary …
  126. 126. The ‘Empirical Turn’ in Legal Scholarship; The Diversity of Tasks that Lawyers Undertake; Some Epistemological Issues / Questions; The Other Type of Work That Lawyers Do; Quantitative Legal Prediction; The Three Forms of (Legal) Prediction
  127. 127. thelawlab.com
  128. 128. LexPredict.com
  129. 129. ComputationalLegalStudies.com BLOG
  130. 130. @ computational
  131. 131. Daniel Martin Katz | @computational | computationallegalstudies.com | lexpredict.com | danielmartinkatz.com | thelawlab.com | illinois tech - chicago kent college of law
