Using Machine Learning to Optimize DevOps Practices
Ignite Presentation
1. A FAMILIAR STORY
HERE IS A MOUNTAIN OF PAPER DOCUMENTS. . .
By: Brad Stauber, MSU College of Law
2. WHAT DO WE SEE?
• Long and Expensive Process
• Humans get tired, upset, hungover, etc.
• Mistakes can happen
3. MAJOR SHIFTS IN
• The Economy
• The Customer
• The Technology
• The Data
4. THE ECONOMY
• THE GREAT RECESSION
• Individuals, Corporations, and Law Firms
• Hurt financially
• Forced to institute belt-tightening measures
• Tighter financing requirements
5. THE CUSTOMER
• No longer willing to hand out exorbitant amounts of money for e-discovery
• Greater emphasis on:
• Better, Faster, Cheaper
• “Going Green”
• More tech savvy
6. THE TECHNOLOGY
• Emergence of the “Cloud”
• Increased Computing Power
• Decrease in Data Storage Costs
• Moore’s Law
8. THE EFFECT OF THESE MAJOR SHIFTS
• In The Past,
• All You Needed Was Keyword Searches and Manual Review
• Now and In the Future,
• You Would Be Wise to Have Predictive Coding and Clustering In Your E-Discovery Arsenal
9. PREDICTIVE CODING
• Combines the Efficiencies of:
• A Computerized Sampling System, and
• A Human Expert
• Components:
• Data
• Complex Algorithms
• Software/Programs
• Human Input
• Samples and Tests
10. HOW DOES THE PROCESS WORK?
• THIS TRAINING PROCESS CAN BE REPEATED, AND THE PROGRAM CAN CONTINUE TO LEARN.
• EACH ROUND THUS IMPROVES THE ACCURACY OF ITS OUTPUT.
11. WHERE HAVE WE SEEN PREDICTIVE ANALYTICS BEFORE?
• Google
• Translate, Spell Check, Searches
• Netflix
• Movie Suggestions
• Pandora
• Predicts songs that the user should like
…and the list goes on and on.
12. PROS OF PREDICTIVE CODING
• Faster than Linear Review
• Recalls a Higher Percentage of the Relevant Documents
• Higher Precision than a Human Document Reviewer
• Cost Savings
• Allows Law Firms to Do More with Less
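The recall and precision claims above rest on two standard measures: recall is the share of truly relevant documents that the review found, and precision is the share of flagged documents that are truly relevant. A minimal sketch of the arithmetic follows. The document IDs and the function name `precision_recall` are made up for illustration and do not come from the presentation.

```python
def precision_recall(flagged, relevant):
    """Return (precision, recall) for two collections of document IDs."""
    flagged, relevant = set(flagged), set(relevant)
    true_positives = len(flagged & relevant)
    # Guard against division by zero when a set is empty.
    precision = true_positives / len(flagged) if flagged else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical review: documents 1-4 are truly relevant,
# and the review flagged documents 1, 2, 3, and 7.
truly_relevant = {1, 2, 3, 4}
flagged_by_review = {1, 2, 3, 7}

precision, recall = precision_recall(flagged_by_review, truly_relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here three of the four flagged documents are relevant, and three of the four relevant documents were found, so both measures come out to 0.75.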
13. CONS OF PREDICTIVE CODING
• The “smoking gun” may be missed
• Privileged documents may be produced
• Savings may not be as good as advertised
• Training Costs
• Document Review Costs
14. CLUSTERING
• The program analyzes the text of ESI and groups related documents together into clusters.
• Clusters can be:
• Concept-Based
• Restricted to certain keywords
• Duplicates or Near-Duplicates
• E.g., A Cluster of Bob’s Emails to Sally
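The grouping idea on this slide can be sketched with a very simple text-similarity measure. The example below assumes Jaccard similarity over word sets and a greedy single-pass grouping, which is only one of many possible approaches and is not necessarily what any commercial e-discovery product does. The sample emails, the 0.5 threshold, and all function names are hypothetical.

```python
import re


def tokens(text):
    """Lowercased set of words in a document."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster(docs, threshold=0.5):
    """Greedy single-pass clustering.

    Each document joins the first cluster whose seed document is at least
    `threshold` similar; otherwise it starts a new cluster.
    """
    clusters = []  # list of (seed_tokens, [document ids])
    for doc_id, text in docs.items():
        doc_tokens = tokens(text)
        for seed_tokens, members in clusters:
            if jaccard(seed_tokens, doc_tokens) >= threshold:
                members.append(doc_id)
                break
        else:
            clusters.append((doc_tokens, [doc_id]))
    return [members for _, members in clusters]


# Hypothetical ESI: two near-duplicate emails and one unrelated message.
docs = {
    "email-1": "Bob to Sally: the merger contract draft is attached",
    "email-2": "Bob to Sally: the merger contract draft is attached, revised",
    "email-3": "Lunch order for Friday, please reply by noon",
}
print(cluster(docs))
```

The two emails from Bob to Sally share nine of ten distinct words, so they land in one cluster, while the lunch message shares none and forms its own.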
15. ADVANTAGES OF CLUSTERING
• Quickly identify major topics and sub-topics.
• Apply tags to a single document, a cluster of documents, or a group of clusters.
• Clustering helps to reveal document relationships and context.
• Identify near-duplicates and process them as a unit or individually.
• Automatic categorization.
16. CONSIDERATIONS
• Implementation Considerations:
• Software Costs
• Training Costs
• Mistakes Will Be Made
• Risk that a computer can’t find something that a human could
• How long will this technology last?
17. WHY ARE SOME LAWYERS STILL IN THE STONE AGE?
• Business Model?
• More Time Used = More Billable Hours = Greater Equity Share
• Belief That Legal Services Should
Require Human Input?
• Fear of Change?
18. CONCLUSION
• Lawyers Need to Adapt
• Lawyers Need to Conquer Their Fear
• Predictive Coding and Clustering allow you to deliver legal services
• Better, Faster, and Cheaper
Because Sooner or Later, Relying on Linear Review and Keyword Searches WILL BE CONSIDERED…
Editor's Notes
Moore’s Law: the number of transistors on a chip doubles roughly every two years, which has historically brought large gains in overall processing power.
Predictive Coding combines the efficiencies of a computerized sampling system with a human “expert.” The human interacts with the system by making “yes/no” calls on a question for each document in a series of controlled samples, 40 documents at a time. Example questions include “Is this document responsive?”, “Does it pertain to this specific issue?”, and “Is this document privileged?”
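The review loop described in this note can be sketched in a few lines. This is a toy illustration under stated assumptions, not how any real predictive coding product works. The learner (`WordFrequencyModel`) simply compares how often each word appears in “yes” versus “no” documents, the human expert is simulated by a hypothetical `ask_expert` function, and the corpus is invented. Only the batch size of 40 comes from the note itself.

```python
from collections import Counter

BATCH_SIZE = 40  # sample size mentioned in the note above


class WordFrequencyModel:
    """Toy learner: compares word rates in 'yes' and 'no' documents."""

    def __init__(self):
        self.word_counts = {True: Counter(), False: Counter()}
        self.doc_counts = {True: 0, False: 0}

    def train(self, text, label):
        """Record one expert call (label is True for 'yes', False for 'no')."""
        self.word_counts[label].update(set(text.lower().split()))
        self.doc_counts[label] += 1

    def score(self, text):
        """Sum, over the document's words, of (yes rate - no rate)."""
        total = 0.0
        for word in set(text.lower().split()):
            yes_rate = self.word_counts[True][word] / max(self.doc_counts[True], 1)
            no_rate = self.word_counts[False][word] / max(self.doc_counts[False], 1)
            total += yes_rate - no_rate
        return total

    def predict(self, text):
        return self.score(text) > 0


def train_in_rounds(documents, ask_expert, model, rounds):
    """Each round, the expert labels the next batch and the model learns from it."""
    reviewed = 0
    for _ in range(rounds):
        batch = documents[reviewed:reviewed + BATCH_SIZE]
        if not batch:
            break
        for text in batch:
            model.train(text, ask_expert(text))
        reviewed += len(batch)
    return reviewed


# Hypothetical corpus: every other memo mentions the merger.
documents = [
    f"memo {i} about the merger terms" if i % 2 == 0 else f"memo {i} about the lunch menu"
    for i in range(200)
]


def ask_expert(text):
    """Stand-in for the human's yes/no call: 'Is this document responsive?'"""
    return "merger" in text


model = WordFrequencyModel()
reviewed = train_in_rounds(documents, ask_expert, model, rounds=2)
print(reviewed)
print(model.predict("draft merger terms"), model.predict("friday lunch menu"))
```

After two rounds the expert has reviewed 80 of the 200 documents, and the model already separates merger-related text from unrelated text. Running more rounds feeds it more expert calls, which is the repeatable training step the slides refer to.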
In my research, there was a pessimistic article that mentioned these concerns.