An Inevitable Reality: Machine-based eDiscovery Review Jason R. Baron, Esq. Director of Litigation, National Archives and Records Administration jason.baron@nara.gov James D. Shook, Esq. Director, eDiscovery and Compliance Group, EMC Corporation jim.shook@emc.com
The Problem Technologies and Techniques The “Unfolding Law” & Current Research Question & Answer
Information Today – The Big Picture: 1.8 ZB (Lots of It); 95% Mostly Unstructured; 85% Mostly Unmanaged; 85% Created by Organizations; ▲ Becoming More Regulated
A Legal Crossroads “[T]he legal profession is at a crossroads: the choice is between continuing to conduct discovery as it has ‘always been practiced’ in a paper world – before the advent of computers [and] the Internet . . . Or, alternatively, embracing new ways of thinking in today’s digital world.” The Sedona Conference, The Sedona Conference Commentary on Achieving Quality in the E-Discovery Process (2009)
Search v. Review
A Better Mousetrap
- 10M docs (25% attachments)
- Review 50 per hour
- 100 people, 10 hrs per day, 7 days a week, 52 weeks a year ...
- 28 weeks ...
- $20 million in cost
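The slide's figures can be checked with simple arithmetic. The sketch below uses only the numbers on the slide, except for the hourly rate: the slide does not state one, so the $100 per reviewer-hour figure is an assumption, chosen because it is what $20 million divided by the implied review hours works out to.

```python
# Inputs taken from the slide.
DOCS = 10_000_000        # documents to review
DOCS_PER_HOUR = 50       # review speed per person
REVIEWERS = 100
HOURS_PER_DAY = 10
DAYS_PER_WEEK = 7

# Assumption (not on the slide): cost per reviewer-hour in dollars.
ASSUMED_RATE_PER_HOUR = 100.0

review_hours = DOCS / DOCS_PER_HOUR                        # total person-hours
team_hours_per_week = REVIEWERS * HOURS_PER_DAY * DAYS_PER_WEEK
weeks = review_hours / team_hours_per_week                 # calendar weeks for the team
cost = review_hours * ASSUMED_RATE_PER_HOUR

print(f"person-hours: {review_hours:,.0f}")
print(f"weeks: {weeks:.1f}")
print(f"cost at assumed rate: ${cost:,.0f}")
```

Under these inputs the team needs a little over 28 weeks, which matches the slide's "28 weeks".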
FINDING RESPONSIVE DOCUMENTS IN A LARGE DATA SET: FOUR LOGICAL CATEGORIES
DOCUMENT SET
- Relevant and Retrieved
- Not Relevant and Retrieved (FALSE POSITIVES)
- Relevant and Not Retrieved (FALSE NEGATIVES)
- Not Relevant and Not Retrieved
The Problem Technologies and Techniques The “Unfolding Law” and Current Research Question & Answer
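The four categories above map directly onto set operations, and recall and precision follow from the category sizes. This is a minimal sketch: the function names and the ten-document example are hypothetical and are not taken from the slides.

```python
def categorize(all_docs, relevant, retrieved):
    """Split a document set into the four logical categories."""
    all_docs, relevant, retrieved = set(all_docs), set(relevant), set(retrieved)
    return {
        "relevant_retrieved": relevant & retrieved,
        "not_relevant_retrieved": retrieved - relevant,      # false positives
        "relevant_not_retrieved": relevant - retrieved,      # false negatives
        "not_relevant_not_retrieved": all_docs - relevant - retrieved,
    }


def recall_precision(categories):
    """Recall = share of relevant docs retrieved; precision = share of retrieved docs that are relevant."""
    tp = len(categories["relevant_retrieved"])
    fp = len(categories["not_relevant_retrieved"])
    fn = len(categories["relevant_not_retrieved"])
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    return recall, precision


# Hypothetical example: 10 documents, 4 relevant, 5 retrieved by a search.
cats = categorize(range(1, 11), relevant={1, 2, 3, 4}, retrieved={3, 4, 5, 6, 7})
recall, precision = recall_precision(cats)
print(cats)
print(f"recall={recall:.2f} precision={precision:.2f}")
```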
Techniques Advanced Search Greater Interaction with Opposing Counsel Iterative, tiered and phased approach Project Management, Sampling, Quality Control Jason R. Baron, Law in the Age of Exabytes: Some Further Thoughts on ‘Information Inflation’ and Current Issues in e-Discovery Search, XVII Rich. J.L. & Tech. 9 (2011), http://jolt.richmond.edu/v17i3/article9.pdf
Technology Tools Greater Use Made of Boolean Strings Fuzzy Search Models Probabilistic models (Bayesian) Statistical methods (clustering) Machine learning approaches to semantic representation Categorization tools: taxonomies and ontologies Social network analysis Hybrid approaches Reference: Appendix to The Sedona Conference® Best Practices Commentary on the Use of Search and Information Retrieval Methods in E-Discovery (2007), available at http://www.thesedonaconference.org (link to publications)
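To illustrate the difference between the first two tools on this list, the sketch below compares an exact Boolean AND search with a fuzzy variant that tolerates misspellings. It is an illustration only: the sample documents, the function names, and the 0.8 similarity cutoff are assumptions, and the standard-library `difflib` matcher stands in for whatever fuzzy model a real review platform would use.

```python
import difflib
import re


def tokens(text):
    """Lowercase a document and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def boolean_and(doc, terms):
    """Exact Boolean AND: every term must appear verbatim."""
    words = set(tokens(doc))
    return all(term in words for term in terms)


def fuzzy_and(doc, terms, cutoff=0.8):
    """Fuzzy AND: every term must have a close match among the document's words."""
    words = tokens(doc)
    return all(
        difflib.get_close_matches(term, words, n=1, cutoff=cutoff)
        for term in terms
    )


# Hypothetical documents; d2 contains misspellings of both query terms.
docs = {
    "d1": "Memo on tobacco marketing budget",
    "d2": "memo re: tobaco markting plans",
    "d3": "Lunch menu for Friday",
}
query = ["tobacco", "marketing"]

boolean_hits = [d for d, text in docs.items() if boolean_and(text, query)]
fuzzy_hits = [d for d, text in docs.items() if fuzzy_and(text, query)]
print("boolean:", boolean_hits)
print("fuzzy:  ", fuzzy_hits)
```

The exact search misses the misspelled document, while the fuzzy search picks it up. A looser cutoff would also raise the number of false positives.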
Emerging New Predictive Strategies Improved review and case assessment: cluster docs thru use of software with minimal human intervention at front end to code “seeded” data set Slide adapted from Gartner Conference June 23, 2010 Washington, D.C.
The Problem Technologies and Techniques  The “Unfolding Law” and Current Research Question & Answer
Unfolding Law Fed. Rule Civ. P. 1 (aim is to secure the just, speedy, economical determination of every action) U.S. v. O’Keefe  Victor Stanley I Privilege Concerns
Judge Facciola writing for the U.S. District Court for the District of Columbia     “Whether search terms or ‘keywords’ will yield the information sought is a complicated question involving the interplay, at least, of the sciences of computer technology, statistics and linguistics. See George L. Paul & Jason R. Baron, ‘Information Inflation: Can the Legal System Adapt?’, 13 RICH. J.L. & TECH. 10 (2007) *  *  *  Given this complexity, for lawyers and judges to dare opine that a certain search term or terms would be more likely to produce information than the terms that were used is truly to go where angels fear to tread.” 	-- U.S. v. O'Keefe, 537 F.Supp.2d 14, 24 (D.D.C. 2008).
Judge Grimm writing for the U.S. District Court for the District of Maryland     “[W]hile it is universally acknowledged that keyword searches are useful tools for search and retrieval of ESI, all keyword searches are not created equal; and there is a growing body of literature that highlights the risks associated with conducting an unreliable or inadequate keyword search or relying on such searches for privilege review.”  Victor Stanley, Inc. v. Creative Pipe, Inc., 250 F.R.D. 251 (D. Md. 2008); see id., text accompanying nn. 9 & 10 (citing to Sedona Search Commentary & TREC Legal Track research project)
What is TREC?
- Conference series co-sponsored by the National Institute of Standards and Technology (NIST) and the Advanced Research and Development Activity (ARDA) of the Department of Defense
- Designed to promote research into the science of information retrieval
- First TREC conference was in 1992
- 15th Conference held November 15-17, 2006 in Gaithersburg, Maryland, U.S. (NIST headquarters)
TREC Legal Track
- The TREC Legal Track was designed to evaluate the effectiveness of search technologies in a real-world legal context
- First of a kind study using nonproprietary data since Blair/Maron research in 1985
- Hypothetical complaints and 100+ “requests to produce” drafted by members of The Sedona Conference®
- “Boolean negotiations” conducted as a baseline for search efforts
- Documents to be searched were drawn from a publicly available 7 million document tobacco litigation Master Settlement Agreement database
- New Interactive Task added in 2008 and continued in 2009 using Topic Authorities and a post-adjudication round
- In 2009, a second Enron data set was added as a separate task
- Participating teams of information scientists from around the world contributing computer runs, plus in 2008 thru 2011 from legal service providers
- Results from 2010 round currently being processed – will be posted on TREC website soon
Recall & Precision
Team A: Precision = 7.7%, F1 = 12.3%
Precision = 16.9%, F1 = 18.3%
Precision = 84.4%, F1 = 80.1%
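The recall figures for this slide did not survive, but F1 is the harmonic mean of precision and recall, F1 = 2PR / (P + R), so recall can be estimated as R = F1 × P / (2P − F1). The sketch below applies that rearrangement to the three precision/F1 pairs on the slide. The function name is hypothetical, and because the inputs are rounded to one decimal place the implied recall is only approximate, especially where 2P − F1 is small, as in the first pair.

```python
def recall_from_precision_f1(precision, f1):
    """Solve F1 = 2PR / (P + R) for R, given P and F1 as fractions."""
    denominator = 2 * precision - f1
    if denominator <= 0:
        raise ValueError("no valid recall: F1 must be less than 2 * precision")
    return f1 * precision / denominator


# (precision, F1) pairs as shown on the slide, converted to fractions.
pairs = [(0.077, 0.123), (0.169, 0.183), (0.844, 0.801)]

implied_recall = [recall_from_precision_f1(p, f1) for p, f1 in pairs]
for (p, f1), r in zip(pairs, implied_recall):
    print(f"precision={p:.1%} F1={f1:.1%} -> implied recall ~ {r:.1%}")
```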
“Boolean” Searches May Miss A Large Percentage of Relevant Documents 78% of relevant documents were only found by some other technique Source: TREC 2007 Legal Track
Interactive Task – Results from 2008 & 2009 [chart covering Topics 102, 103 and 104 (2008) and Topics 201 through 207 (2009)] Source: 2008/2009 TREC Legal Track

Jason Baron, Esq. and James Shook, Esq. - An Inevitable Reality: Machine-based eDiscovery Review

  • 1. An Inevitable Reality: Machine-based eDiscovery Review Jason R. Baron, Esq.Director of Litigation, National Archives and Records Administration jason.baron@nara.gov James D. Shook, Esq. Director, eDiscovery and Compliance Group, EMC Corporation jim.shook@emc.com
  • 2. The Problem Technologies and Techniques The "Unfolding Law“ & Current Research Question & Answer
  • 3. 1.8Zb Lots of It 95% Mostly Unstructured 85% Mostly Unmanaged 85% Created by Organizations ▲ Becoming More Regulated Information Today – The Big Picture Information
  • 4. A Legal Crossroads “[T]he legal profession is at a crossroads: the choice is between continuing to conduct discovery as it has ‘always been practiced’ in a paper world – before the advent of computers [and] the Internet . . . Or, alternatively, embracing new ways of thinking in today’s digital world.” The Sedona Conference, The Sedona Conference Commentary on Achieving Quality in the E-Discovery Process (2009)
  • 6.
  • 7. FINDING RESPONSIVE DOCUMENTS IN A LARGE DATA SET: FOUR LOGICAL CATEGORIES Not Relevant and Retrieved Relevant and Retrieved DOCUMENT SET FALSE POSITIVES Relevant and Not Retrieved Not Relevant and Not Retrieved FALSE NEGATIVES
  • 8. The Problem Technologies and Techniques The “Unfolding Law” and Current Research Question & Answer
  • 9. Techniques Advanced Search Greater Interaction with Opposing Counsel Iterative, tiered and phased approach Project Management, Sampling, Quality Control Jason R. Baron, Law in the Age of Exabytes: Some Further Thoughts on ‘Information Inflation’ and Current Issues in e-Discovery Search, XVII Rich. J.L. & Tech. 9 (2011), http://jolt.richmond.edu/v17i3/article9.pdf
  • 10. 10 Technology Tools Greater Use Made of Boolean Strings Fuzzy Search Models Probabilistic models (Bayesian) Statistical methods (clustering) Machine learning approaches to semantic representation Categorization tools: taxonomies and ontologies Social network analysis Hybrid approaches Reference: Appendix to The Sedona Conference® Best Practices Commentary on the Use of Search and Information Retrieval Methods in E-Discovery (2007), available at http://www.thesedonaconference.org (link to publications)
  • 11. Emerging New Predictive Strategies Improved review and case assessment: cluster docs thru use of software with minimal human intervention at front end to code “seeded” data set Slide adapted from Gartner Conference June 23, 2010 Washington, D.C.
  • 12. The Problem Technologies and Techniques The “Unfolding Law” and Current Research Question & Answer
  • 13. Unfolding Law Fed. Rule Civ. P. 1 (aim is to secure the just, speedy, economical determination of every action) U.S. v. O’Keefe Victor Stanley I Privilege Concerns
  • 14. Judge Facciola writing for the U.S. District Court for the District of Columbia “Whether search terms or ‘keywords’ will yield the information sought is a complicated question involving the interplay, at least, of the sciences of computer technology, statistics and linguistics. See George L. Paul & Jason R. Baron, Information Inflation: Can the Legal System Adapt?, 13 RICH. J.L. & TECH. 10 (2007) * * * Given this complexity, for lawyers and judges to dare opine that a certain search term or terms would be more likely to produce information than the terms that were used is truly to go where angels fear to tread.” -- U.S. v. O'Keefe, 537 F. Supp. 2d 14, 24 (D.D.C. 2008).
  • 15. Judge Grimm writing for the U.S. District Court for the District of Maryland “[W]hile it is universally acknowledged that keyword searches are useful tools for search and retrieval of ESI, all keyword searches are not created equal; and there is a growing body of literature that highlights the risks associated with conducting an unreliable or inadequate keyword search or relying on such searches for privilege review.” Victor Stanley, Inc. v. Creative Pipe, Inc., 250 F.R.D. 251 (D. Md. 2008); see id., text accompanying nn. 9 & 10 (citing to Sedona Search Commentary & TREC Legal Track research project)
  • 16. What is TREC? Conference series co-sponsored by the National Institute of Standards and Technology (NIST) and the Advanced Research and Development Activity (ARDA) of the Department of Defense Designed to promote research into the science of information retrieval First TREC conference was in 1992 15th Conference held November 15-17, 2006 in Gaithersburg, Maryland (NIST headquarters)
  • 17. TREC Legal Track The TREC Legal Track was designed to evaluate the effectiveness of search technologies in a real-world legal context First-of-its-kind study using nonproprietary data since Blair/Maron research in 1985 Hypothetical complaints and 100+ “requests to produce” drafted by members of The Sedona Conference® “Boolean negotiations” conducted as a baseline for search efforts Documents to be searched were drawn from a publicly available 7 million document tobacco litigation Master Settlement Agreement database New Interactive Task added in 2008 and continued in 2009 using Topic Authorities and a post-adjudication round In 2009, a second Enron data set was added as a separate task Participating teams of information scientists from around the world contributing computer runs, plus in 2008 through 2011 from legal service providers Results from 2010 round currently being processed – will be posted on TREC website soon
  • 25. “Boolean” Searches May Miss A Large Percentage of Relevant Documents 78% of relevant documents were only found by some other technique Source: TREC 2007 Legal Track
  • 26. Interactive Task – Results from 2008 & 2009 Topic 102 (2008) Topic 103 (2008) Topic 104 (2008) Topic 201 (2009) Topic 202 (2009) Topic 203 (2009) Topic 204 (2009) Topic 205 (2009) Topic 206 (2009) Topic 207 (2009) Source: 2008/2009 TREC Legal Track
  • 27. An Inevitable Reality: Machine-based eDiscovery Review The Problem Technologies and Techniques The “Unfolding Law” and Current Research Question & Answer Jason R. Baron, Esq., Director of Litigation, National Archives and Records Administration jason.baron@nara.gov James D. Shook, Esq., Director, eDiscovery and Compliance Group, EMC Corporation jim.shook@emc.com www.kazeon.com/blog
  • 28. Next Steps Best practices white papers, analyst papers and more… eDiscovery kazeon.com emc.com/ediscovery Information Governance emc.com/informationgovernance emc.com/SourceOneCity Upcoming events Masters Conference mastersconference.com Best Practices eDiscovery webcasts (EMC+Masters Conf) kazeon.com/newsroom2/webinars.php
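
The four logical categories on slide 7 correspond to the standard precision and recall measures used in information retrieval evaluation. The following is a minimal Python sketch of that arithmetic, not anything taken from the presentation or from TREC: the class name, field names, and document counts are all hypothetical. The counts were chosen only so that recall comes out at 22%, which is one way to read the slide 25 figure that 78% of relevant documents were found only by some other technique.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetrievalCounts:
    """Document counts for the four categories on slide 7 (names are illustrative)."""

    relevant_retrieved: int          # correctly found
    not_relevant_retrieved: int      # false positives
    relevant_not_retrieved: int      # false negatives
    not_relevant_not_retrieved: int  # correctly left out

    def precision(self) -> float:
        """Share of retrieved documents that are actually relevant."""
        retrieved = self.relevant_retrieved + self.not_relevant_retrieved
        return self.relevant_retrieved / retrieved if retrieved else 0.0

    def recall(self) -> float:
        """Share of all relevant documents that the search retrieved."""
        relevant = self.relevant_retrieved + self.relevant_not_retrieved
        return self.relevant_retrieved / relevant if relevant else 0.0


# Hypothetical 10,000-document collection; these numbers are assumptions,
# not TREC results.
counts = RetrievalCounts(
    relevant_retrieved=220,
    not_relevant_retrieved=330,
    relevant_not_retrieved=780,
    not_relevant_not_retrieved=8670,
)

print(f"precision = {counts.precision():.2f}")
print(f"recall    = {counts.recall():.2f}")
```

With these assumed counts, a search can look reasonably precise while still leaving most relevant documents unretrieved, which is the risk the false-negative quadrant on slide 7 describes.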

Editor's Notes

  1. Key Issue: Why are search and other semantically based technologies the most-important ones in e-discovery?