Legal Analytics, Machine Learning and Some Comments on the Status of Innovation in the Legal Industry - Professors Daniel Martin Katz & Michael J Bommarito II - Presentation @ The Forum on Legal Evolution- NYC

  1. 1. daniel martin katz michael j bommarito adjunct professor @ university of michigan associate professor of law @ illinois tech - chicago kent co-founder @ LexPredict co-founder @ LexPredict Legal Analytics, Machine Learning and Some Comments on the Status of Innovation in the Legal Industry
  2. 2. Forum on Legal Evolution NYC
  3. 3. in the legal industry there already is better corn
  4. 4. “We always overestimate the change that will occur in the next two years and underestimate the change that will occur in the next ten” - bill gates
  5. 5. today’s focus is primarily legal analytics + process engineering © daniel martin katz michael j bommarito
  6. 6. before providing some concrete examples - some broad thoughts ... © daniel martin katz michael j bommarito
  7. 7. three faces of innovation in legal © daniel martin katz michael j bommarito
  8. 8. (1) lawyers for innovators / entrepreneurs © daniel martin katz michael j bommarito
  9. 9. © daniel martin katz michael j bommarito what most lawyers and law schools think of as “Law+Entrepreneurship” (1) lawyers for innovators / entrepreneurs
  10. 10. (2) lawyers as innovators - substance © daniel martin katz michael j bommarito
  11. 11. poison pill - “the most important innovation in corporate law since Samuel Calvin Tate Dodd invented the trust for John D. Rockefeller and Standard Oil in 1879” © daniel martin katz michael j bommarito (2) lawyers as innovators - substance
  12. 12. emerging areas - 3D Printing, Driverless Cars, Augmented Reality, Drones, Internet of Things, CyberSecurity, Data Breach, Big Data+Privacy, etc. © daniel martin katz michael j bommarito (2) lawyers as innovators - substance
  13. 13. (3) lawyers as innovators - business/process © daniel martin katz michael j bommarito
  14. 14. innovation directed toward transforming the practice of law © daniel martin katz michael j bommarito (3) lawyers as innovators - business/process
  15. 15. © daniel martin katz michael j bommarito
  16. 16. © daniel martin katz michael j bommarito there are different ways that organizations are innovating on the third face
  17. 17. {Law Substantive Legal Expertise Analytics Platform AI Computing Process Mapping User Experience Design Thinking Business Models Regulation Marketing + Tech + Design TM + Delivery} © daniel martin katz michael j bommarito
  18. 18. © daniel martin katz michael j bommarito
  19. 19. © daniel martin katz michael j bommarito some traditional law firms have been very aggressive
  20. 20. © daniel martin katz michael j bommarito
  21. 21. © daniel martin katz michael j bommarito but most of the innovation is Lex.Startup
  22. 22. © daniel martin katz michael j bommarito Lex.Startup is beginning to take hold
  23. 23. 15 2009 Lex.Startup
  24. 24. 15 2009 Lex.Startup
  25. 25. Lex.Startup: 15 Law or Legal Related Companies* in 2009, 425+ in 2014 (*as highlighted by Josh Kubicki @ ReInventLaw London 2013)
  26. 26. © daniel martin katz michael j bommarito So what are these folks doing?
  27. 27. R + D Function in the Legal Industry © daniel martin katz michael j bommarito
  28. 28. We Could Imagine a World Where Law Firms Did the R+D for the Industry © daniel martin katz michael j bommarito
  29. 29. But That Has (Mostly) Proven Elusive © daniel martin katz michael j bommarito
  30. 30. © daniel martin katz michael j bommarito Lex.Startup is undertaking that function
  31. 31. © daniel martin katz michael j bommarito Here are the specific approaches that are being undertaken
  32. 32. © daniel martin katz michael j bommarito Some organizations are doing more than one
  33. 33. © daniel martin katz michael j bommarito labor arbitrage
  34. 34. © daniel martin katz michael j bommarito labor arbitrage process/tech arbitrage
  35. 35. © daniel martin katz michael j bommarito labor arbitrage process/tech arbitrage regulatory arbitrage
  36. 36. © daniel martin katz michael j bommarito labor arbitrage process/tech arbitrage regulatory arbitrage design as the ultimate bespoke
  37. 37. © daniel martin katz michael j bommarito labor arbitrage process/tech arbitrage regulatory arbitrage design as the ultimate bespoke predictive analytics
  38. 38. © daniel martin katz michael j bommarito could do an individual talk on any of these topics...
  39. 39. © daniel martin katz michael j bommarito labor arbitrage process/tech arbitrage regulatory arbitrage design as the ultimate bespoke predictive analytics
  40. 40. © daniel martin katz michael j bommarito labor arbitrage process/tech arbitrage regulatory arbitrage design as the ultimate bespoke predictive analytics
  41. 41. © daniel martin katz michael j bommarito
  42. 42. Predictive Analytics in Law © daniel martin katz michael j bommarito
  43. 43. © daniel martin katz michael j bommarito The Data Driven Future of the Legal Industry
  44. 44. © daniel martin katz michael j bommarito is already here...
  45. 45. © daniel martin katz michael j bommarito 2011
  46. 46. © daniel martin katz michael j bommarito 2011
  47. 47. © daniel martin katz michael j bommarito 2012
  48. 48. © daniel martin katz michael j bommarito 2013
  49. 49. © daniel martin katz michael j bommarito 2013
  50. 50. © daniel martin katz michael j bommarito 2013
  51. 51. © daniel martin katz michael j bommarito 2013
  52. 52. 2013
  53. 53. © daniel martin katz michael j bommarito 2013
  54. 54. © daniel martin katz michael j bommarito 2013
  55. 55. © daniel martin katz michael j bommarito Quantitative Legal Prediction - or - How I Learned to Stop Worrying and Start Preparing for the Data Driven Future of the Legal Services Industry Daniel Martin Katz Associate Professor of Law Michigan State University 62 Emory L. J. 909 (2013)
  56. 56. © daniel martin katz michael j bommarito Cause and Effect vs. Quantitative Legal Prediction
  57. 57. © daniel martin katz michael j bommarito Cause and Effect vs. Quantitative Legal Prediction
  58. 58. © daniel martin katz michael j bommarito Machine Learning is the heart of predictive analytics
  59. 59. Legal Analytics Professor Daniel Martin Katz Professor Michael J Bommarito II @MSU Law - Winter 2014 © daniel martin katz michael j bommarito
  60. 60. Some Machine Learning Methods. Supervised: Statistical models (Bayesian, e.g., Naïve Bayes Classification; Frequentist, e.g., Ordinary Least Squares), Neural Networks (NN), Support Vector Machines (SVM), Random Forests (RF), Genetic Algorithms (GA). Semi/Unsupervised: Neural Networks (NN), Clustering (K-means, Hierarchical, Radial Basis (RBF), Graph) © daniel martin katz michael j bommarito
  61. 61. http://scikit-learn.org/stable/tutorial/machine_learning_map/index.html © daniel martin katz michael j bommarito
  62. 62. the family of machine learning methods: classification, clustering, regression, dimension reduction © daniel martin katz michael j bommarito
  63. 63. Quick Example of the Methods © daniel martin katz michael j bommarito
  64. 64. © daniel martin katz michael j bommarito Adapted from Slides By Victor Lavrenko and Nigel Goddard @ University of Edinburgh Take A Look At These 12
  65. 65. © daniel martin katz michael j bommarito (age, gender, species): 72 Female Human; 3 Female Horse; 36 Male Human; 21 Male Human; 67 Male Human; 29 Female Human; 54 Male Human; 44 Male Human; 50 Male Human; 42 Female Human; 6 Male Dog; 7 Female Human
  66. 66. © daniel martin katz michael j bommarito Classification (Supervised Learning): f( ) = Gender? (female / male, separated by a decision boundary)
  67. 67. © daniel martin katz michael j bommarito Classification (Supervised Learning): f( ) = Gender? (female / male, separated by a decision boundary). Regression (Supervised Learning): f( ) = Age? (#)
  68. 68. © daniel martin katz michael j bommarito Classification (Supervised Learning): f( ) = Gender? (female / male, separated by a decision boundary). Multi Class Classification (Supervised Learning): f( ) = Loan Application? (Yes / Maybe or Perhaps / No; Multiclass = Boundary Hyperplane). Regression (Supervised Learning): f( ) = Age? (#)
  69. 69. © daniel martin katz michael j bommarito Classification (Supervised Learning): f( ) = Gender? (female / male, separated by a decision boundary). Multi Class Classification (Supervised Learning): f( ) = Loan Application? (Yes / Maybe or Perhaps / No; Multiclass = Boundary Hyperplane). Regression (Supervised Learning): f( ) = Age? (#). Clustering (Unsupervised Learning): f( ) = Group? (Cluster)
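
The three task types on the slides above can be sketched on the twelve records from slide 65. This is only an illustration: the original slides apply f( ) to pictures, whereas this sketch uses the tabulated age and gender values, and the function names and the two-cluster choice are assumptions made here, not part of the deck.

```python
from statistics import mean

# The twelve (age, gender, species) records listed on slide 65.
RECORDS = [
    (72, "Female", "Human"), (3, "Female", "Horse"), (36, "Male", "Human"),
    (21, "Male", "Human"), (67, "Male", "Human"), (29, "Female", "Human"),
    (54, "Male", "Human"), (44, "Male", "Human"), (50, "Male", "Human"),
    (42, "Female", "Human"), (6, "Male", "Dog"), (7, "Female", "Human"),
]


def classification_targets(records):
    """Classification: the quantity to predict is a category (gender)."""
    return [gender for _, gender, _ in records]


def regression_targets(records):
    """Regression: the quantity to predict is a number (age)."""
    return [age for age, _, _ in records]


def cluster_by_age(records, iterations=10):
    """Clustering: no labels are given; group ages with 1-D 2-means."""
    ages = [age for age, _, _ in records]
    centers = [min(ages), max(ages)]  # hypothetical starting centers
    groups = [ages, []]
    for _ in range(iterations):
        groups = [[], []]
        for age in ages:
            nearest = 0 if abs(age - centers[0]) <= abs(age - centers[1]) else 1
            groups[nearest].append(age)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return groups


younger, older = cluster_by_age(RECORDS)
print(sorted(younger), sorted(older))
```

The supervised functions need a human-supplied answer for every record; the clustering function only ever looks at the ages themselves.
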
  70. 70. © daniel martin katz michael j bommarito Regression as a Prediction Tool
  71. 71. © daniel martin katz michael j bommarito Regression as a Prediction Tool
  72. 72. © daniel martin katz michael j bommarito Standard Linear Regression Can Be Used to Predict a Probability (using LPM, Logit, etc.)
  73. 73. © daniel martin katz michael j bommarito Standard Linear Regression Can Be Used to Predict a Quantity
  74. 74. © daniel martin katz michael j bommarito Task = Predict the Expected Cost of a Given Legal Service. Regression (Supervised Learning): f( ) = Cost? (#)
  75. 75. © daniel martin katz michael j bommarito http://reinventlawchannel.com/ron-gruner-were-on-a-mission/
  76. 76. © daniel martin katz michael j bommarito Y = β0 +/- β1 (X1) +/- β2 (X2) +/- β3 (X3) +/- β4 (X4) +/- β5 (X5) + ε; Y = $151 + $15 (Per 100 Lawyers) + $161 (If Tier 1 Market is True) + $95 (Partner Status is True) + $34 (Per 10 Years) +/- β5 (Practice Area) + ε
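
As a worked example, the fitted equation on slide 76 can be evaluated for one lawyer. This sketch assumes the slide's labels pair with the coefficients in the order shown (15 per 100 lawyers, 161 for a Tier 1 market, 95 for partner status, 34 per 10 years). The slide gives no number for the practice-area term, so it is left as a hypothetical adjustment defaulting to zero, and the error term ε is omitted.

```python
def predicted_cost(lawyers_in_firm, tier1_market, is_partner,
                   years_experience, practice_area_adjustment=0.0):
    """Evaluate the slide's fitted equation (without the error term)."""
    return (151.0                                   # intercept
            + 15.0 * (lawyers_in_firm / 100)        # per 100 lawyers
            + 161.0 * (1 if tier1_market else 0)    # Tier 1 market indicator
            + 95.0 * (1 if is_partner else 0)       # partner status indicator
            + 34.0 * (years_experience / 10)        # per 10 years
            + practice_area_adjustment)             # hypothetical; not on the slide


# A partner with 20 years of experience at a 500-lawyer firm in a Tier 1 market:
# 151 + 75 + 161 + 95 + 68 = 550
print(predicted_cost(500, True, True, 20))
```
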
  77. 77. © daniel martin katz michael j bommarito Turn Around and Use This Model To Predict Other Lawyers (also Matters, etc.)
  78. 78. © daniel martin katz michael j bommarito This Requires a Method to Deal With Changes in Dynamics, etc.
  79. 79. © daniel martin katz michael j bommarito This Requires a Method to Update the Model as Time Moves Forward
  80. 80. © daniel martin katz michael j bommarito Must Deal With Overfitting to the Existing Data
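
One common guard against overfitting is to score the model on data it was not fitted to. The sketch below is a minimal holdout check with a one-variable least-squares line; the function names, the 25% test fraction, and the mean-absolute-error score are all choices made for this illustration rather than anything specified in the deck.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def mean_abs_error(model, xs, ys):
    """Average absolute gap between predictions and observed values."""
    intercept, slope = model
    return sum(abs(y - (intercept + slope * x)) for x, y in zip(xs, ys)) / len(xs)


def holdout_check(xs, ys, test_fraction=0.25, seed=0):
    """Fit on one part of the data and score on the part the model never saw."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    cut = int(len(indices) * (1 - test_fraction))
    train, test = indices[:cut], indices[cut:]
    model = fit_line([xs[i] for i in train], [ys[i] for i in train])
    train_error = mean_abs_error(model, [xs[i] for i in train], [ys[i] for i in train])
    test_error = mean_abs_error(model, [xs[i] for i in test], [ys[i] for i in test])
    return train_error, test_error
```

A test error far above the training error suggests the model has memorized the existing data. Re-running the same fit as new matters arrive is one simple way to keep the model current.
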
  81. 81. © daniel martin katz michael j bommarito
  82. 82. Machine Learning and the Future of E-Discovery © daniel martin katz michael j bommarito
  83. 83. imagine your client is served with a request for production © daniel martin katz michael j bommarito
  84. 84. assume this is the size of the hypothetical document set (emails, memos, etc.), in random order
  85. 85. we can sample a subset of the documents
  86. 86. we can sample a subset of the documents
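
Drawing that subset can be sketched as a simple random sample without replacement. The function name, the document identifiers, and the sample size below are hypothetical; the deck does not say how large the sample should be.

```python
import random


def draw_training_sample(document_ids, sample_size, seed=None):
    """Split the collection into a random sample and everything else."""
    rng = random.Random(seed)
    sample = rng.sample(document_ids, sample_size)  # without replacement
    chosen = set(sample)
    remainder = [doc_id for doc_id in document_ids if doc_id not in chosen]
    return sample, remainder


all_documents = list(range(1000))  # stand-in identifiers for emails, memos, etc.
sample, remainder = draw_training_sample(all_documents, 50, seed=1)
print(len(sample), len(remainder))
```
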
  87. 87. © daniel martin katz michael j bommarito classification clustering regression dimension reduction
  88. 88. © daniel martin katz michael j bommarito classification
  89. 89. © daniel martin katz michael j bommarito
  90. 90. predictive coding = ~ binary classification © daniel martin katz michael j bommarito
  91. 91. © daniel martin katz michael j bommarito Learning Task = Determine Whether a Given Document is Relevant. Binary Classification (Supervised Learning): f( ) = relevance? (Relevant / Not Relevant)
  92. 92. take the sample set as a training set and use human experts © daniel martin katz michael j bommarito
  93. 93. the use of the human experts is called “supervised learning” © daniel martin katz michael j bommarito
  94. 94. in the simple binary case, ask humans to assign objects to two piles © daniel martin katz michael j bommarito
  95. 95. Apply Human Coders © daniel martin katz michael j bommarito
  96. 96. yellow = relevant white = non-relevant and return this © daniel martin katz michael j bommarito
  97. 97. Non Relevant / Relevant © daniel martin katz michael j bommarito
  98. 98. Key Insight ... © daniel martin katz michael j bommarito
  99. 99. What Allows A Human To Separate These Two Classes of Documents? © daniel martin katz michael j bommarito
  100. 100. that precise human process is what “predictive coding” is trying to mimic © daniel martin katz michael j bommarito
  101. 101. most vendors are selling a largely undifferentiated product © daniel martin katz michael j bommarito
  102. 102. Humans are selecting upon some “features” of the documents © daniel martin katz michael j bommarito
  103. 103. to place those documents in their respective bins (i.e. relevant, non-relevant) © daniel martin katz michael j bommarito
  104. 104. features =? text, author, date, other metadata © daniel martin katz michael j bommarito
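
A minimal sketch of turning one document into such features follows. The `Document` fields and the `word=` / `author=` / `year=` naming scheme are assumptions made for illustration; real review platforms expose many more metadata fields.

```python
import re
from dataclasses import dataclass
from datetime import date


@dataclass
class Document:
    text: str
    author: str
    sent: date


def extract_features(doc):
    """Turn one document into a flat dict of named features."""
    features = {}
    # Text: count how often each lowercase word appears.
    for word in re.findall(r"[a-z']+", doc.text.lower()):
        key = "word=" + word
        features[key] = features.get(key, 0) + 1
    # Metadata: author and year become simple indicator features.
    features["author=" + doc.author.lower()] = 1
    features["year=" + str(doc.sent.year)] = 1
    return features


memo = Document("Merger terms: merger pricing", "Alice", date(2013, 5, 1))
print(extract_features(memo))
```
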
  105. 105. machine learning task is trying to recover (learn) what separates the relevant from the non-relevant documents © daniel martin katz michael j bommarito
  106. 106. once we learn the rule / boundary we can apply it to separate the remaining documents into the two classes © daniel martin katz michael j bommarito
  107. 107. © daniel martin katz michael j bommarito we want to take what we learn here
  108. 108. © daniel martin katz michael j bommarito we want to take what we learn here
  109. 109. © daniel martin katz michael j bommarito we want to take what we learn here and apply it here
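
The learn-here, apply-there loop can be sketched with a small Naive Bayes text classifier. Naive Bayes is listed among the supervised methods earlier in the deck, but nothing in the slides says it is what any vendor actually uses. The training texts and labels below are invented for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())


def train(labeled_docs):
    """labeled_docs: (text, label) pairs coded by human reviewers."""
    word_counts = {}        # label -> Counter of words seen under that label
    doc_counts = Counter()  # label -> number of training documents
    for text, label in labeled_docs:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokenize(text))
    vocabulary = {w for counts in word_counts.values() for w in counts}
    return word_counts, doc_counts, vocabulary


def classify(model, text):
    """Return the label with the highest log-probability score."""
    word_counts, doc_counts, vocabulary = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in doc_counts:
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word in vocabulary:
                # Add-one smoothing avoids log(0) for words unseen under this label.
                count = word_counts[label][word] + 1
                score += math.log(count / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical human-coded sample (the training set).
coded_sample = [
    ("pricing agreement with competitor", "relevant"),
    ("confidential pricing memo merger", "relevant"),
    ("lunch menu friday", "not relevant"),
    ("office party friday schedule", "not relevant"),
]
model = train(coded_sample)

# Apply what was learned to documents no human has reviewed.
for text in ["draft pricing memo", "friday lunch"]:
    print(text, "->", classify(model, text))
```
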
  110. 110. © daniel martin katz michael j bommarito
  111. 111. the future of e-discovery will follow the arc of machine learning © daniel martin katz michael j bommarito
  112. 112. Machine Learning Methods 2 x 2 (Supervised vs. Unsupervised; Informed vs. Naive): Predictive Coding (Classification); Basic Clustering Algorithm; The Long Term Future © daniel martin katz michael j bommarito
  113. 113. there are different forms of learning by machines ... © daniel martin katz michael j bommarito
  114. 114. There Is Learning Within a Matter (i.e. learning from a specific training set) © daniel martin katz michael j bommarito
  115. 115. In other words, it is possible for the machine to learn from the experience of having processed documents in the past © daniel martin katz michael j bommarito
  116. 116. both inside a given company but also across companies ... © daniel martin katz michael j bommarito
  117. 117. this is how data aggregation / reusing data becomes very powerful © daniel martin katz michael j bommarito
  118. 118. data aggregation / reusing data make the naive into the informed © daniel martin katz michael j bommarito
  119. 119. data aggregation / reusing data help move from the supervised to the semi/unsupervised © daniel martin katz michael j bommarito
  120. 120. Machine Learning Methods 2 x 2 (Supervised vs. Unsupervised; Informed vs. Naive): Predictive Coding (Classification); Basic Clustering Algorithm; The Future © daniel martin katz michael j bommarito
  121. 121. © daniel martin katz michael j bommarito
  122. 122. Machine Learning Natural Language Processing and Due Diligence © daniel martin katz michael j bommarito
  123. 123. © daniel martin katz michael j bommarito
  124. 124. © daniel martin katz michael j bommarito The system comes pre-trained for provisions including: Title, Parties, Date, Term, Change of Control, Assignment, Indemnity, Confidentiality, Governing Law, License Grant, Bankruptcy, Notice, Amendment, Non-Solicit, and more.
  125. 125. Based on testing, we know our system finds 90% or more of the instances of nearly every substantive provision it covers. This 90% number is our system’s recall; its precision differs provision by provision but is consistently very manageable. © daniel martin katz michael j bommarito
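
For readers less familiar with the two terms: recall is the share of true instances the system finds, and precision is the share of the system's hits that are correct. The sketch below computes both from sets of provision identifiers; the counts in the example are made up to produce a 90% recall and are not the vendor's test data.

```python
def precision_recall(predicted, actual):
    """predicted: items the system flagged; actual: items that truly qualify."""
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical contract set: 10 real change-of-control clauses (ids 0-9).
actual = set(range(10))
# The system flags 9 of them plus 3 false alarms.
predicted = set(range(9)) | {100, 101, 102}

precision, recall = precision_recall(predicted, actual)
print(precision, recall)  # 9/12 = 0.75 precision, 9/10 = 0.9 recall
```
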
  126. 126. We are able to build custom provisions on request. Thanks to our highly customized training algorithms, this process is easy and relatively automated. We are also engaged in adding more provisions. © daniel martin katz michael j bommarito
  127. 127. © daniel martin katz michael j bommarito
  128. 128. Machine Learning and Judicial Behavior © daniel martin katz michael j bommarito
  129. 129. © daniel martin katz michael j bommarito
  130. 130. © daniel martin katz michael j bommarito
  131. 131. © daniel martin katz michael j bommarito
  132. 132. © daniel martin katz michael j bommarito
  133. 133. 2002 Prediction Tourney and its limits © daniel martin katz michael j bommarito
  134. 134. © daniel martin katz michael j bommarito Model Leverages Classification Tree (Tool from Machine Learning)
  135. 135. Standard Decision Tree Often Not Generalizable, Often Overfits the Data © daniel martin katz michael j bommarito
  136. 136. Need a more complex approach © daniel martin katz michael j bommarito
  137. 137. Predicting the Behavior of the United States Supreme Court: A General Approach © daniel martin katz michael j bommarito [Figure: Justices Black, Reed, Frankfurter, Douglas, Jackson, Burton, Clark, Minton, Warren, Harlan, Brennan, Whittaker, Stewart, White, Goldberg, Fortas, Marshall, Burger, Blackmun, Powell, Rehnquist, Stevens, O'Connor, Scalia, Kennedy, Souter, Thomas, Ginsburg, Breyer, Roberts, Alito, Sotomayor, Kagan; years 1953 to 2013; scale 0.00 to 1.00; series: 9-0; 8-1, 7-2, 6-3; Reverse]
  138. 138. feature engineering © daniel martin katz michael j bommarito The real world gives us raw material, at best. Typically, you even have to dig the raw material out of your own unstructured data
  139. 139. similar approach can be applied to other problems © daniel martin katz michael j bommarito
  140. 140. © daniel martin katz michael j bommarito
  141. 141. © daniel martin katz michael j bommarito Case Prediction and Litigation Data
  142. 142. © daniel martin katz michael j bommarito
  143. 143. © daniel martin katz michael j bommarito
  144. 144. © daniel martin katz michael j bommarito
  145. 145. © daniel martin katz michael j bommarito “John Dragseth, a principal at Fish & Richardson (the most active IP litigation firm in the United States, according to Corporate Counsel magazine), credits Lex Machina’s database with helping him spot meaningful but otherwise hidden trends in IP litigation - and he won’t give details. ‘If you published it, then people on the other side would know,’ he says.”
  146. 146. © daniel martin katz michael j bommarito Notice there is an offloading of data but it is up to the end user to derive meaning
  147. 147. © daniel martin katz michael j bommarito In general, the relevant consumer market is not yet mature when it comes to data science
  148. 148. © daniel martin katz michael j bommarito Difficult to sell machine learning technology in instances where the end user does not have the right assets in place
  149. 149. © daniel martin katz michael j bommarito
  150. 150. © daniel martin katz michael j bommarito Many other examples ... just starting to come online
  151. 151. © daniel martin katz michael j bommarito Attorney Quality and Performance
  152. 152. © daniel martin katz michael j bommarito Leveraging Public Data for Legal Insight
  153. 153. © daniel martin katz michael j bommarito
  154. 154. © daniel martin katz michael j bommarito
  155. 155. Change Management is the Hardest Innovation of All © daniel martin katz michael j bommarito
  156. 156. © daniel martin katz michael j bommarito Bulls (~1984 - 2009) and Bears (~2009 - 2014)
  157. 157. © daniel martin katz michael j bommarito If you were 28 in 1984 then you were 53 in 2009 and 58 in 2014
  158. 158. © daniel martin katz michael j bommarito before 2009 most of the individuals in the profession have only known the bull market
  159. 159. © daniel martin katz michael j bommarito it is a bear market now ... and in a bear market you need a serious strategy
  160. 160. © daniel martin katz michael j bommarito analytics/data should be part of that strategy
  161. 161. © daniel martin katz michael j bommarito “data is the oil of the 21st Century”
  162. 162. So let's be wildcatters
  163. 163. © daniel martin katz michael j bommarito law < > finance many elements in law look like finance did 25 years ago
  164. 164. © daniel martin katz michael j bommarito
  165. 165. © daniel martin katz michael j bommarito When it comes to innovation at the level that is going to be needed ...
  166. 166. © daniel martin katz michael j bommarito Assigning an innovation partner or an innovation committee is probably not enough
  167. 167. © daniel martin katz michael j bommarito Skunk Works
  168. 168. © daniel martin katz michael j bommarito how many organizations have a full time data scientist (data science team)?
  169. 169. © daniel martin katz michael j bommarito need a full scale and empowered R+D team (data science, tech, etc.)
  170. 170. © daniel martin katz michael j bommarito Final Thought
  171. 171. Exit, Voice & Loyalty © daniel martin katz michael j bommarito
  172. 172. daniel martin katz michael j bommarito ii adjunct professor of Law @ michigan state university associate professor of law @ illinois tech - chicago kent co-founder @ LexPredict director of research @ reInventLaw laboratory co-founder @ LexPredict Forum on Legal Evolution NYC
