@ computational
computationallegalstudies.com
Predictive Coding
and E-Discovery
in 2015 and
Beyond
Daniel Martin Katz
Michael J. Bommarito II
The 2x2 Machine Learning Spectrum
Info Viz & Pattern Detection
Rates of Scaling
The 2x2 Machine
Learning Spectrum
In Order to
Understand Where
We Are Heading ...
in 2015 and
Beyond ...
it is necessary to have
insight regarding how
predictive coding
actually works
Predictive Coding
Relies Upon a
Particular Class of
Machine Learning
Methods
The Current Approach
is drawn from
the family of so-called
“supervised methods”
What is the difference
between supervised
and unsupervised?
As you have likely seen ...
Predictive Coding
Develop a Training Set
using human experts
In the simple case,
assign objects to
two piles
Take This Document Set ...
Apply Human Coders
yellow = relevant
white = non-relevant
And Return This ...
Non-Relevant    Relevant
Key Insight ...
What Allows A Human
To Separate These
Two Classes of
Documents?
that precise human
process is what
predictive coding is
trying to mimic
Humans are selecting
on features of
documents
to place those
documents in their
respective bins
(i.e. relevant, non-relevant)
features =?
text,
author,
date,
other metadata
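
A minimal sketch of how the features listed above (text, author, date, other metadata) might be turned into something a machine can use. The record layout, the `extract_features` helper, and the `has_attachment` flag are hypothetical names chosen for illustration; real review platforms will differ.

```python
import re
from collections import Counter
from datetime import date


def extract_features(doc):
    """Turn one document record into a flat dictionary of features."""
    features = Counter()
    # Text: lowercase word counts (a simple bag of words).
    for token in re.findall(r"[a-z0-9']+", doc["text"].lower()):
        features["word=" + token] += 1
    # Author and date become categorical indicator features.
    features["author=" + doc["author"].lower()] = 1
    features["year=" + str(doc["date"].year)] = 1
    # Other metadata: here, a hypothetical attachment flag.
    if doc.get("has_attachment"):
        features["has_attachment"] = 1
    return dict(features)


# Hypothetical document record, invented for this example.
example = {
    "text": "Please review the merger memo before Friday.",
    "author": "J. Smith",
    "date": date(2012, 3, 9),
    "has_attachment": True,
}
features = extract_features(example)
```

Each document becomes a set of named counts and flags, which is the form most supervised methods expect as input.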
supervised methods
“learn” from the
training data
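
One way to picture what "learning" from the training data means is a small Naive Bayes text classifier written from scratch. Naive Bayes is used here only as an illustration; the deck does not say which algorithm any particular predictive coding product uses, and the four training documents below are invented.

```python
import math
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())


def train(labeled_docs):
    """labeled_docs: list of (text, label) pairs coded by human reviewers."""
    doc_counts = Counter()   # number of documents per label
    word_counts = {}         # label -> Counter of word frequencies
    vocab = set()
    for text, label in labeled_docs:
        doc_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for tok in tokenize(text):
            counts[tok] += 1
            vocab.add(tok)
    return doc_counts, word_counts, vocab


def predict(model, text):
    doc_counts, word_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in doc_counts:
        # Log prior plus log likelihood of each known word (add-one smoothing).
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for tok in tokenize(text):
            if tok in vocab:
                score += math.log(
                    (word_counts[label][tok] + 1) / (total_words + len(vocab))
                )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training set: the two piles produced by human coders.
training_set = [
    ("merger pricing discussion with competitor", "relevant"),
    ("confidential merger terms attached", "relevant"),
    ("lunch menu for friday", "non-relevant"),
    ("office holiday party schedule", "non-relevant"),
]
model = train(training_set)
prediction = predict(model, "draft merger terms")
```

The model never sees a rule written by a person. It only sees which words the human coders tended to place in each pile, and scores new documents accordingly.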
but there are different
forms of learning by
machines ...
There Is Learning
Within a Matter
(i.e. learning from a
specific training set)
But what about using
prior matters to inform
both feature selection
and the weighting of
those features
In other words, it is
possible to learn from
the experience of
having processed
documents in the past
both inside a given
company and
across companies ...
It comes from
data aggregation / reusing data
This is Learning and
Rule Propagation
Across Matters
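
A rough sketch of one way prior matters could inform feature weighting: word counts from earlier matters are blended, at a discount, into the counts for the current matter. This is an assumption made for illustration and not a method the deck specifies; `pooled_counts` and `prior_weight` are hypothetical names, and the counts are invented.

```python
from collections import Counter


def pooled_counts(current_counts, prior_matter_counts, prior_weight=0.25):
    """Blend current-matter word counts with discounted counts from prior matters."""
    blended = Counter()
    for word, n in current_counts.items():
        blended[word] += n
    for counts in prior_matter_counts:
        for word, n in counts.items():
            # Evidence from earlier matters counts, but for less than
            # evidence from the matter at hand.
            blended[word] += prior_weight * n
    return blended


# Hypothetical counts of words seen in documents coded "relevant".
current_matter = Counter({"merger": 2})
prior_matters = [
    Counter({"merger": 8, "pricing": 4}),
    Counter({"pricing": 4}),
]
blended = pooled_counts(current_matter, prior_matters)
```

A word such as "pricing" that has not yet appeared in the current training set still carries some weight because it mattered in earlier reviews.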
feedback loops are the
best friends of algorithms
feedback loops can help
algorithms become
much smarter ...
Machine Learning Methods 2 x 2
Axes: Supervised vs. Unsupervised, Informed vs. Naive
Cells labeled: Predictive Coding, Basic Clustering Algorithm, The Future
Some Machine Learning Algorithms
Supervised
  Statistical models
    Bayesian, e.g., Naïve Bayes Classification
    Frequentist, e.g., Ordinary Least Squares
  Neural Networks (NN)
  Support Vector Machines (SVM)
  Random Forests (RF)
  Genetic Algorithms (GA)
Semi/unsupervised
  Neural Networks (NN)
  Clustering
    K-means
    Hierarchical
    Radial Basis (RBF)
    Graph
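
To contrast with the supervised case, here is a minimal sketch of k-means, one of the clustering methods listed above. No human labels are supplied; the algorithm groups documents purely by similarity. The two-dimensional vectors are invented, and using the first k points as starting centroids is a simplification chosen to keep the example deterministic.

```python
def kmeans(points, k, iterations=10):
    # Deterministic start: use the first k points as initial centroids.
    centroids = [list(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        for i, p in enumerate(points):
            distances = [
                sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids
            ]
            assignments[i] = distances.index(min(distances))
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:
                centroids[j] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignments, centroids


# Hypothetical document vectors, e.g. counts of two key terms per document.
docs = [(5, 0), (0, 4), (4, 1), (1, 5), (6, 1), (0, 6)]
assignments, centroids = kmeans(docs, k=2)
```

The output is a set of groups, not a relevance call. A person still has to look at each cluster and decide what it means.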
Info Viz &
Pattern
Detection
Think about the task faced
by the intelligence
community ...
mountains of
information to process
how are those
intelligence
analysts aided?
Information
Visualization
The Visual Cortex is a very
powerful CPU ...
We are very good
pattern detectors ...
We need a mix of analytics
and viz ...
because there are significant
efficiency gains to be
obtained from applications of
sophisticated data
visualization techniques
This Next Generation of
E-Discovery Software is
viz-intensive ...
but this is only
the beginning ...
including an even more
enriched notion of time
dynamics ...
Rates of Scaling
Will Discovery Costs
Eventually Be Reduced?
Two Scaling Relationships
that are in question ...
Cost Per Gig
“[I]n 2001, a 300 Gb legal matter would take 200 attorneys a full
year to review, at a cost of about $15 million.
In 2003, a similar-sized matter took 100 attorneys 3 weeks to
complete, at a cost of $6 million.
And in 2006, a 300 Gb investigation took 65 attorneys only 2.5
days to complete, at a cost of $2 million.
And now, cases with several hundreds of Gbs are routine.”
Improving Document Review in E-Discovery
FTI Consulting
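
The figures quoted above can be reduced to a cost per gigabyte. The sketch below assumes the 2003 matter was also 300 GB (the quote says only "similar-sized"), and treats "a full year" as 365 calendar days and "3 weeks" as 21 days when computing attorney-days.

```python
# (year, gigabytes, attorneys, duration in days, cost in USD)
# The 300 GB size for 2003 and the day counts for 2001 and 2003
# are assumptions, as noted in the text above.
reviews = [
    (2001, 300, 200, 365, 15_000_000),
    (2003, 300, 100, 21, 6_000_000),
    (2006, 300, 65, 2.5, 2_000_000),
]

cost_per_gb = {}
attorney_days = {}
for year, gigabytes, attorneys, days, cost in reviews:
    cost_per_gb[year] = cost / gigabytes
    attorney_days[year] = attorneys * days

for year in sorted(cost_per_gb):
    print(
        f"{year}: ${cost_per_gb[year]:,.0f} per GB, "
        f"{attorney_days[year]:,.1f} attorney-days"
    )
```

On these assumptions the cost per gigabyte falls from $50,000 in 2001 to $20,000 in 2003 and to roughly $6,667 in 2006.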
Past Rate
of ESI Creation
Long Term
Rate of ESI Creation?
Daniel Martin Katz
Michigan State University
Associate Professor of Law
@ computational
computationallegalstudies.com
reinventlaw.com
http://about.me/daniel.martin.katz

Predictive Coding and E-Discovery in 2015 and Beyond - LegalTechNYC 2013 (Daniel Martin Katz + Michael J. Bommarito II)