AIIM Conference 2014
Orlando, FL
April 2, 2014
Jason R. Baron, Esq.
Information Governance and eDiscovery Group
Drinker Biddle & Reath LLP
Washington, D.C. 20005
© Jason R. Baron 2014
Finding The Signal in the Noise: Bringing Predictive Analytics To the Information Governance Space
(c) Jason R. Baron 2013
We have entered the era where
Big Data is ….
The World Has Changed
§  We are not just managing thousands or millions of paper files
§  We are at an inflection point in history in terms of data volume
§  IDC Report: 1,800 new exabytes this year
(1 exabyte = data equivalent of 50,000 yrs of continuous movies)
§  Open data policies vs. “the iceberg”: a vast amount of information is “hidden” underneath the web; how is it to be reliably preserved and accessed?
Reality:
The era of information inflation and Big Data in litigation has
just begun….
Lehman Brothers Investigation
-  350 billion page universe (3 petabytes)
-  Examiner narrowed collection by selecting key custodians, using dozens of Boolean searches
-  Reviewed 5 million docs (40 million pages) using 70 contract attorneys
Source: Report of Anton R. Valukas, Examiner, In re Lehman Brothers Holdings Inc., et al., Chapter 11 Case No. 08-13555 (U.S. Bankruptcy Ct. S.D.N.Y. March 11, 2010), Vol. 7, Appx. 5, at http://lehmanreport.jenner.com/.
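For a rough sense of scale, the figures on this slide imply that the material actually reviewed was a very small fraction of the page universe. The sketch below only works through that arithmetic from the numbers quoted above; the variable names are illustrative, and the per-attorney figure assumes, purely for illustration, that pages were split evenly.

```python
# Figures quoted on the slide (Lehman Brothers examiner's report).
universe_pages = 350_000_000_000   # 350 billion page universe
reviewed_docs = 5_000_000          # documents reviewed
reviewed_pages = 40_000_000        # pages reviewed
attorneys = 70                     # contract attorneys

share_reviewed = reviewed_pages / universe_pages
pages_per_doc = reviewed_pages / reviewed_docs
# Assumption for illustration only: pages split evenly across attorneys.
pages_per_attorney = reviewed_pages / attorneys

print(f"Share of universe reviewed: {share_reviewed:.4%}")
print(f"Average pages per document: {pages_per_doc:.0f}")
print(f"Pages per attorney (even split): {pages_per_attorney:,.0f}")
```

Even after custodian selection and Boolean culling, the reviewed set works out to roughly one page in every nine thousand of the original universe.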
Information governance is needed in a world where . . .
-  80% of enterprise data is unstructured
-  60% of documents are obsolete
-  50% of documents are duplicate
-  80% of documents are not retrieved by traditional search
www.aiim.org/infochaos
Do YOU understand the business challenge of the next 10 years?
This ebook from AIIM President John Mancini explains.
Traditional Document Review Processes
§  Labor intensive
§  Linear Review
§  Quality of manual coding for responsiveness open to question
(see RAND Study, 2012)
Searching the Haystack….
to find relevant needles…
[Figure labels: False Positives; Relevant Smoking Policy Emails; OMB; VP Chief of Staff Ron Klain; Office of the U.S. Trade Rep.; White House Counsel]
Example of Boolean search string from
U.S. v. Philip Morris
§  (((master settlement agreement OR msa) AND
NOT (medical savings account OR metropolitan
standard area)) OR s. 1415 OR (ets AND NOT
educational testing service) OR (liggett AND NOT
sharon a. liggett) OR atco OR lorillard OR (pmi
AND NOT presidential management intern) OR
pm usa OR rjr OR (b&w AND NOT photo*) OR
phillip morris OR batco OR ftc test method OR
star scientific OR vector group OR joe camel OR
(marlboro AND NOT upper marlboro)) AND NOT
(tobacco* OR cigarette* OR smoking OR tar OR
nicotine OR smokeless OR synar amendment OR
philip morris OR r.j. reynolds OR ("brown and
williamson") OR ("brown & williamson") OR bat
industries OR liggett group)
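To make the mechanics of the AND NOT exclusions concrete, here is a minimal sketch that evaluates just one fragment of the string above against a few invented documents. The helper names and the sample texts are hypothetical, and plain substring matching stands in for whatever term matching the actual search tool used.

```python
def contains(text, phrase):
    """Case-insensitive substring match (a simplification of real term matching)."""
    return phrase.lower() in text.lower()

def msa_clause(text):
    """One fragment of the query above:
    (master settlement agreement OR msa)
    AND NOT (medical savings account OR metropolitan standard area)
    """
    wanted = contains(text, "master settlement agreement") or contains(text, "msa")
    excluded = (contains(text, "medical savings account")
                or contains(text, "metropolitan standard area"))
    return wanted and not excluded

# Invented sample documents, for illustration only.
docs = {
    "a": "Draft talking points on the Master Settlement Agreement",
    "b": "Your MSA (medical savings account) enrollment form",
    "c": "Census figures for the Dallas MSA",
}

hits = [doc_id for doc_id, text in docs.items() if msa_clause(text)]
print(hits)
```

Document "b" is correctly excluded, but document "c" still matches because it uses the abbreviation without any excluded phrase. Each new exclusion patches one known source of false positives without anticipating the next one.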
Emerging New Strategies: “Predictive Analytics”
Improved review and case assessment: cluster docs through use of software with minimal human intervention at front end to code “seeded” data set
Slide adapted from Gartner Conference, June 23, 2010, Washington, D.C.
Defining “predictive coding” or
“TAR”
§  A process for prioritizing or coding a collection of electronic
documents using a computerized system that harnesses human
judgments of one or more subject matter experts on a smaller
set of documents and then extrapolates those judgments to the
remaining document population.
§  Also referred to as “supervised or active machine learning,”
“computer-assisted review” or “technology-assisted review”
Source: Adapted from Grossman-Cormack Glossary of Technology Assisted Review, v. 1.0 (Oct 2012)
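The definition above can be illustrated with a toy example: a reviewer codes a small seed set, a model learns term weights from those judgments, and the weights are used to rank the uncoded documents. The sketch below uses a simple naive Bayes style log-odds score (with class priors omitted) as a stand-in; it is not the method of any particular TAR product, and the function names and sample texts are invented.

```python
import math
from collections import Counter

def tokenize(text):
    return text.lower().split()

def train(seed_set):
    """seed_set: list of (text, is_relevant) pairs coded by a human reviewer.
    Returns per-term log-odds weights (naive Bayes with add-one smoothing)."""
    counts = {True: Counter(), False: Counter()}
    for text, label in seed_set:
        counts[label].update(tokenize(text))
    vocab = set(counts[True]) | set(counts[False])
    totals = {label: sum(c.values()) + len(vocab) for label, c in counts.items()}
    return {
        term: math.log((counts[True][term] + 1) / totals[True])
              - math.log((counts[False][term] + 1) / totals[False])
        for term in vocab
    }

def score(weights, text):
    """Higher score = more likely relevant; unseen terms contribute nothing."""
    return sum(weights.get(term, 0.0) for term in tokenize(text))

# Invented seed set, for illustration only.
seed = [
    ("tobacco settlement negotiation memo", True),
    ("cigarette marketing settlement draft", True),
    ("office holiday party schedule", False),
    ("parking garage schedule update", False),
]
weights = train(seed)

# Extrapolate the reviewer's judgments to documents nobody has coded yet.
unreviewed = ["settlement memo on cigarette tax", "holiday parking update"]
ranked = sorted(unreviewed, key=lambda t: score(weights, t), reverse=True)
print(ranked)
```

In an active learning workflow, the top-ranked or most uncertain documents would go back to the reviewer, and the model would be retrained on the enlarged seed set.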
Judicial endorsement of predictive
analytics in document review by Judge
Peck in da Silva Moore v. Publicis
Groupe (SDNY Feb. 24, 2012)
This opinion appears to be the first in which a Court has approved of the
use of computer-assisted review. . . . What the Bar should take away from
this Opinion is that computer-assisted review is an available tool and
should be seriously considered for use in large-data-volume cases where
it may save the producing party (or both parties) significant amounts of
legal fees in document review. Counsel no longer have to worry about
being the ‘first’ or ‘guinea pig’ for judicial acceptance of computer-assisted
review . . . Computer-assisted review can now be considered judicially-
approved for use in appropriate cases.
The da Silva Moore Protocol
• Supervised learning
•  Random sampling
•  Establishment of seed set
• Issue tags
•  Iteration
•  Random sampling of docs deemed irrelevant
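The last step in the list, random sampling of documents deemed irrelevant, is a quality check on what the system left behind (sometimes called an elusion test). The sketch below shows one way such an estimate could be computed; the function name, the sample size, the invented data, and the normal-approximation margin are all assumptions for illustration, not details taken from the protocol itself.

```python
import math
import random

def estimate_elusion(discard_pile, review, sample_size, seed=0):
    """Draw a simple random sample from the documents the system deemed
    irrelevant, have a reviewer code it, and estimate the share of relevant
    documents left behind, with a normal-approximation 95% margin."""
    rng = random.Random(seed)          # fixed seed so the draw is repeatable
    sample = rng.sample(discard_pile, sample_size)
    found = sum(1 for doc in sample if review(doc))
    rate = found / sample_size
    margin = 1.96 * math.sqrt(rate * (1 - rate) / sample_size)
    return rate, margin

# Invented data: 10,000 discarded documents, 200 of which are actually relevant.
discard_pile = [{"id": i, "relevant": i < 200} for i in range(10_000)]

# The lambda stands in for a human reviewer's relevance call.
rate, margin = estimate_elusion(discard_pile, lambda d: d["relevant"], 500)
print(f"Estimated elusion rate: {rate:.1%} +/- {margin:.1%}")
```

If the estimated rate is higher than the parties consider acceptable, the iteration step applies: more documents are coded and the model is retrained before sampling again.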
The demise of RM….
● John Mancini, President of AIIM:
• “If by traditional records management you mean
manual systems—even if they are computerized – then
I would say traditional records management is dead.
The idea that we could get busy people to care about
our complicated retention schedules, and drag and
drop documents into folders, and manually apply
metadata document by document according to an
elaborate taxonomy will soon seem as ridiculous as
asking a blacksmith to work on a Ferrari.”
Process Optimization Problem: The
transactional toll of user-based
recordkeeping schemes (“as is” RM)
…. and the need for better,
automated solutions ….
Email is still the 800 lb. gorilla of ediscovery
Archivist/OMB Directive
● M-12-18, Managing Government Records
Directive, dated 8/24/12:
1.1 By 2019, Federal agencies will manage all
permanent records in an electronic format.
1.2 By 2016, Federal agencies will manage both
permanent and temporary email records in an
accessible electronic format.
http://www.whitehouse.gov/sites/default/files/omb/memoranda/2012/m-12-18.pdf
NARA Moved to the Cloud for Email with
Embedded RM/Autocategorization
Capstone Officials
Capstone officials may
include:
●  Officials at or near the top of
an agency or an organizational
subcomponent
●  Key staff members that may be
in positions that create or
receive presumptively
permanent email records
[Figure labels: Capstone accounts; Other accounts; Key staff accounts; Other accounts]
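The Capstone idea is essentially a role-based rule: the disposition of an email account follows from who holds it rather than from message-by-message filing. A minimal sketch of such a rule follows; the role labels and account names are hypothetical, and treating every non-Capstone account as "temporary" is an assumption made only for illustration.

```python
from dataclasses import dataclass

# Hypothetical role labels; an agency would define its own list.
CAPSTONE_ROLES = {"agency_head", "deputy", "subcomponent_head", "key_staff"}

@dataclass
class Account:
    owner: str
    role: str

def email_disposition(account):
    """Role-based rule: Capstone accounts are presumptively permanent;
    the 'temporary' label for all other accounts is an assumption here."""
    return "permanent" if account.role in CAPSTONE_ROLES else "temporary"

# Invented accounts, for illustration only.
accounts = [
    Account("A. Director", "agency_head"),
    Account("B. Advisor", "key_staff"),
    Account("C. Analyst", "staff"),
]
for acct in accounts:
    print(acct.owner, "->", email_disposition(acct))
```

The appeal of this design is that no end user has to make a filing decision; the rule is applied once per account.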
  
How To Avoid A Train Wreck With
Email Archiving….
Capture E-mail But Utilize Records Management!
Can advanced analytics techniques and technologies,
including Auto-Categorization, Auto-redaction, Auto-
indexing, Auto-translation, etc., be applied and leveraged
by Records Managers/Information Governance types?
Yes, but ….
Information Governance / Records Analytics
Homage to Carl Linnaeus (1707-1778)
Linnaean classification of the animal kingdom
§  Kingdom: Animalia
§  Phylum: Chordata
§  Subphylum: Vertebrata
§  Superclass: Tetrapoda
§  Class: Mammalia
§  Subclass: Theria
§  Infraclass: Eutheria
§  Cohort: Unguiculata
§  Order: Primata
§  Suborder: Anthropoidea
§  Superfamily: Hominoidea
§  Family: Hominidae
§  Subfamily: Homininae
§  Genus: Homo
§  Subgenus: Homo (Homo)
§  Specific epithet: sapiens
Which category?
The Coming Age of Dark Archives (and the
inability to provide access unless we have
smart ways of extracting signal from noise)
We should be leveraging the power of
predictive analytics to improve
information governance . . .
-- RM: defensible disposal of low value information
-- Regulatory compliance
-- Risk mitigation – segregating sensitive materials…
(PII, proprietary, etc.)
-- Business intelligence
-- E-discovery
-- Collaboration across enterprise
-- Providing access to dark data & archives
IG & Analytics: True Life Stories “Ripped from the Headlines”
§  The Case of the Wayward Would-Be Whistleblower
§  The Case of the Mistakenly Valued Merger & Acquisition
What is the IGI?
The IGI is a cross-disciplinary think tank and consortium
dedicated to advancing the adoption of Information Governance
practices and technologies through research, publishing,
advocacy, and peer-to-peer networking.
It provides industry thought leadership and benchmarking designed to foster consensus and conversation.
It is a connector among the stakeholders of information governance.
It is a promoter of industry best practices and standards.
www.iginitiative.com
“The future is here. It is just not evenly
distributed.”
--William Gibson
References
Sources Referencing Information Governance, Autocategorization & Predictive Coding
B. Borden & J.R. Baron, “Finding the Signal in the Noise: Information Governance, Analytics, and The Future of the Law,” 20 Richmond J. Law & Technology 7 (2014), see http://jolt.richmond.edu
J.R. Baron, “Law in the Age of Exabytes: Some Further Thoughts on ‘Information Inflation’ and Current Issues in E-Discovery Search,” 17 Richmond J. Law & Technology (2011), see http://jolt.richmond.edu
N. Pace, “Where The Money Goes: Understanding Litigant Expenditures for Producing E-Discovery,” RAND Publication (2012), see http://www.rand.org/pubs/monographs/MG1208.html
TREC Legal Track Home Page, http://trec-legal.umiacs.umd.edu (includes bibliography for further reading)
The Sedona Conference®, The Sedona Conference Commentary on Information Governance (2013)
Latest “Supervised Learning/Predictive Coding” Case Law:
•  Da Silva Moore v. Publicis Groupe, 2012 WL 607412 (S.D.N.Y. Feb. 24, 2012), approved and adopted in Da Silva Moore v. Publicis Groupe, 2012 WL 1446534, at *2 (S.D.N.Y. Apr. 26, 2012)
•  EORHB v. HOA Holdings, Civ. No. 7409-VCL (Del. Ch. Oct. 15, 2012)
•  Global Aerospace Inc., et al. v. Landow Aviation, L.P., et al., 2012 WL 1431215 (Va. Cir. Ct. Apr. 23, 2012)
•  In re Actos (Pioglitazone) Products, 2012 WL 3899669 (W.D. La. July 27, 2012)
•  Kleen Products, LLC v. Packaging Corp. of America, 10 C 5711 (N.D. Ill.) (Nolan, M.J.)
•  In re Biomet M2a Magnum Hip Implant Products Liability Litigation, 3:12-MD-2391 (N.D. Ind. Apr. 18, 2013)
Jason R. Baron
Of Counsel
Drinker Biddle & Reath LLP
1500 K Street, N.W.
Washington, D.C. 20005
(202) 230-5196
Email: jason.baron@dbr.com
All Needle, No Haystack: Bring Predictive Analysis to Information Governance

  • 1. AIIM Conference 2014 Orlando, FL April 2, 2014 Jason R. Baron, Esq. Information Governance and eDiscovery Group Drinker Biddle & Reath LLP Washington, D.C. 20005 © Jason R. Baron 2014 Finding The Signal in the Noise: Bringing Predictive Analytics To the Information Governance Space
  • 2.
  • 3. (c) Jason R. Baron 2013
  • 4. We have entered the era where Big Data is …. (c) Jason R. Baron 2014
  • 5. The World Has Changed §  We are not just managing thousands or millions of paper files §  We are at an inflection point in history in terms of data volume §  IDC Report: 1800 new exabytes this year (1 exabyte=data equivalent of 50,000 yrs of continuous movies) §  Open data policies vs. “the iceberg”: a vast amount of information is “hidden” underneath the web —how is it to be reliably preserved and accessed? (c) Jason R. Baron 2013
  • 6. (c) Jason R. Baron 2013 Reality: The era of information inflation and Big Data in litigation has just begun…. Lehman Brothers Investigation —  350 billion page universe (3 petabytes) —  Examiner narrowed collection by selecting key custodians, using dozens of Boolean searches —  Reviewed 5 million docs (40 million pages using 70 contract attorneys) Source: Report of Anton R. Valukas, Examiner, In re Lehman Brothers Holdings Inc., et al., Chapter 11 Case No. 08-13555 (U.S. Bankruptcy Ct. S.D.N.Y. March 11, 2010), Vol. 7, Appx. 5, at http:// lehmanreport.jenner.com/.
  • 7. Information governance is needed in a world where . . . -  80% of enterprise data is unstructured -  60% of documents are obsolete -  50% of documents are duplicate -  80% documents are not retrieved by traditional search (c) Jason R. Baron 2013
  • 8. www.aiim.org/infochaos   Do  YOU  understand  the  business     challenge  of  the  next  10  years?   This  ebook  from  AIIM  President   John  Mancini  explains.  
  • 9. Traditional Document Review Processes 8 §  Labor intensive §  Linear Review §  Quality of manual coding for responsiveness open to question (see RAND Study, 2012)
  • 11. 10 to find relevant needles…
  • 12. False Positives Relevant Smoking Policy Emails OMB VP Chief of Staff Ron Klain Office of the U.S. Trade Rep. White House Counsel
  • 13. 12 Example of Boolean search string from U.S. v. Philip Morris §  (((master settlement agreement OR msa) AND NOT (medical savings account OR metropolitan standard area)) OR s. 1415 OR (ets AND NOT educational testing service) OR (liggett AND NOT sharon a. liggett) OR atco OR lorillard OR (pmi AND NOT presidential management intern) OR pm usa OR rjr OR (b&w AND NOT photo*) OR phillip morris OR batco OR ftc test method OR star scientific OR vector group OR joe camel OR (marlboro AND NOT upper marlboro)) AND NOT (tobacco* OR cigarette* OR smoking OR tar OR nicotine OR smokeless OR synar amendment OR philip morris OR r.j. reynolds OR ("brown and williamson") OR ("brown & williamson") OR bat industries OR liggett group)
  • 14. Emerging New Strategies: “Predictive Analytics” Improved review and case assessment: cluster docs through use of software with minimal human intervention at front end to code “seeded” data set Slide adapted from Gartner Conference June 23, 2010 Washington, D.C.
  • 15. Defining “predictive coding” or “TAR” §  A process for prioritizing or coding a collection of electronic documents using a computerized system that harnesses human judgments of one or more subject matter experts on a smaller set of documents and then extrapolates those judgments to the remaining document population. §  Also referred to as “supervised or active machine learning,” “computer-assisted review” or “technology-assisted review” Source: Adapted from Grossman-Cormack Glossary of Technology Assisted Review, v. 1.0 (Oct 2012)
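The definition above can be sketched in a few lines: a reviewer codes a small seed set, a model learns from those judgments, and the model then ranks the documents no one has reviewed. The sketch below assumes a simple naive Bayes word model, which is one of many possible techniques and is not claimed to be what any commercial TAR product uses. The seed documents, the function names `train` and `relevance_score`, and the whitespace tokenizer are all hypothetical.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase whitespace tokenizer (a simplifying assumption)."""
    return text.lower().split()

def train(seed_set):
    """seed_set: (text, is_relevant) pairs coded by a human expert.

    Assumes at least one relevant and one irrelevant example.
    """
    counts = {True: Counter(), False: Counter()}
    n_docs = {True: 0, False: 0}
    for text, label in seed_set:
        counts[label].update(tokenize(text))
        n_docs[label] += 1
    vocab = set(counts[True]) | set(counts[False])
    return counts, n_docs, vocab

def relevance_score(text, model):
    """Log-odds that a document is relevant (naive Bayes, add-one smoothing)."""
    counts, n_docs, vocab = model
    score = math.log(n_docs[True] / n_docs[False])
    totals = {label: sum(counts[label].values()) for label in counts}
    for word in tokenize(text):
        if word not in vocab:
            continue  # words never seen in the seed set carry no evidence
        p_rel = (counts[True][word] + 1) / (totals[True] + len(vocab))
        p_irr = (counts[False][word] + 1) / (totals[False] + len(vocab))
        score += math.log(p_rel / p_irr)
    return score

# Invented seed set standing in for the expert-coded sample.
seed_set = [
    ("nicotine levels in cigarette blends", True),
    ("tar and nicotine test results", True),
    ("lunch menu for friday", False),
    ("parking garage closed friday", False),
]
# Invented "remaining document population".
unreviewed = [
    "friday parking update",
    "new nicotine test results for cigarette samples",
]
model = train(seed_set)
ranked = sorted(unreviewed, key=lambda d: relevance_score(d, model), reverse=True)
```

Here `ranked` puts the nicotine document first, so a reviewer working from the top of the list reaches likely relevant material sooner than in a linear review.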
  • 16. Judicial endorsement of predictive analytics in document review by Judge Peck in da Silva Moore v. Publicis Groupe (SDNY Feb. 24, 2012) This opinion appears to be the first in which a Court has approved of the use of computer-assisted review. . . . What the Bar should take away from this Opinion is that computer-assisted review is an available tool and should be seriously considered for use in large-data-volume cases where it may save the producing party (or both parties) significant amounts of legal fees in document review. Counsel no longer have to worry about being the ‘first’ or ‘guinea pig’ for judicial acceptance of computer-assisted review . . . Computer-assisted review can now be considered judicially-approved for use in appropriate cases.
  • 17. The da Silva Moore Protocol • Supervised learning •  Random sampling •  Establishment of seed set • Issue tags •  Iteration •  Random sampling of docs deemed irrelevant
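The last step listed above, random sampling of documents deemed irrelevant, is a quality check: reviewers read a random sample of the discard pile and count how many relevant documents slipped through. A minimal sketch of that arithmetic follows. The pile size, sample size, count of relevant documents found, and the normal-approximation interval are all assumptions chosen for illustration, not figures or methods taken from the protocol itself.

```python
import math
import random

def sample_discard_pile(discard_ids, sample_size, seed=0):
    """Draw a simple random sample (without replacement) of discarded document IDs."""
    rng = random.Random(seed)
    return rng.sample(discard_ids, sample_size)

def elusion_estimate(num_relevant_found, sample_size, z=1.96):
    """Estimated share of relevant docs in the discard pile.

    Returns (point estimate, lower bound, upper bound) using a normal
    approximation; z=1.96 corresponds to roughly 95% confidence.
    """
    p = num_relevant_found / sample_size
    margin = z * math.sqrt(p * (1 - p) / sample_size)
    return p, max(0.0, p - margin), min(1.0, p + margin)

# Hypothetical numbers: 100,000 discarded docs, 2,000 sampled, 20 found relevant.
sample = sample_discard_pile(range(100_000), 2_000)
rate, low, high = elusion_estimate(num_relevant_found=20, sample_size=len(sample))
```

With these invented figures the point estimate is 1%, with an interval of roughly 0.6% to 1.4%. If the estimated rate is judged too high, the iteration step in the protocol applies: code more documents, retrain, and sample again.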
  • 18. The demise of RM…. ● John Mancini, President of AIIM: • “If by traditional records management you mean manual systems—even if they are computerized – then I would say traditional records management is dead. The idea that we could get busy people to care about our complicated retention schedules, and drag and drop documents into folders, and manually apply metadata document by document according to an elaborate taxonomy will soon seem as ridiculous as asking a blacksmith to work on a Ferrari.”
  • 19. Process Optimization Problem: The transactional toll of user-based recordkeeping schemes (“as is” RM)
  • 20. …. and the need for better, automated solutions ….
  • 21. Email is still the 800 lb. gorilla of ediscovery
  • 22. Archivist/OMB Directive ● M-12-18, Managing Government Records Directive, dated 8/24/12: 1.1 By 2019, Federal agencies will manage all permanent records in an electronic format. 1.2 By 2016, Federal agencies will manage both permanent and temporary email records in an accessible electronic format. http://www.whitehouse.gov/sites/default/files/omb/memoranda/2012/m-12-18.pdf
  • 23. NARA Moved to the Cloud for Email with Embedded RM/Autocategorization
  • 24. Capstone Officials Capstone officials may include: ●  Officials at or near the top of an agency or an organizational subcomponent ●  Key staff members who may be in positions that create or receive presumptively permanent email records [Diagram labels: Capstone accounts, key staff accounts, other accounts]
  • 25. How To Avoid A Train Wreck With Email Archiving…. Capture E-mail But Utilize Records Management!
  • 26. Can advanced analytics techniques and technologies, including Auto-Categorization, Auto-redaction, Auto-indexing, Auto-translation, etc., be applied and leveraged by Records Managers/Information Governance types? Yes, but …. Information Governance / Records Analytics
  • 27. Homage to Carl Linnaeus (1707-1778)
  • 28. Linnaean classification of the animal kingdom §  Kingdom: Animalia §  Phylum: Chordata §  Subphylum: Vertebrata §  Superclass: Tetrapoda §  Class: Mammalia §  Subclass: Theria §  Infraclass: Eutheria §  Cohort: Unguiculata §  Order: Primata §  Suborder: Anthropoidea §  Superfamily: Hominoidea §  Family: Hominidae §  Subfamily: Homininae §  Genus: Homo §  Subgenus: Homo (Homo) §  Specific epithet: sapiens
  • 29. Which category?
  • 30. The Coming Age of Dark Archives (and the inability to provide access unless we have smart ways of extracting signal from noise)
  • 31. We should be leveraging the power of predictive analytics to improve information governance . . . -- RM: defensible disposal of low value information -- Regulatory compliance -- Risk mitigation – segregating sensitive materials… (PII, proprietary, etc.) -- Business intelligence -- E-discovery -- Collaboration across enterprise -- Providing access to dark data & archives
  • 32. IG & Analytics: True Life Stories “Ripped from the Headlines” §  The Case of the Wayward Would-Be Whistleblower §  The Case of the Mistakenly Valued Merger & Acquisition
  • 33. What is the IGI? The IGI is a cross-disciplinary think tank and consortium dedicated to advancing the adoption of Information Governance practices and technologies through research, publishing, advocacy, and peer-to-peer networking. It provides industry thought leadership and benchmarking designed to foster consensus and conversation. It is a connector among the stakeholders of information governance. It is a promoter of industry best practices and standards. www.iginitiative.com
  • 34. “The future is here. It is just not evenly distributed.” --William Gibson
  • 35. References Sources Referencing Information Governance, Autocategorization & Predictive Coding B. Borden & J.R. Baron, “Finding the Signal in the Noise: Information Governance, Analytics, and The Future of the Law,” 20 Richmond J. Law & Technology 7 (2014), http://jolt.richmond.edu J.R. Baron, “Law in the Age of Exabytes: Some Further Thoughts on ‘Information Inflation’ and Current Issues in E-Discovery Search,” 17 Richmond J. Law & Technology (2011), see http://jolt.richmond.edu N. Pace, “Where The Money Goes: Understanding Litigant Expenditures for Producing E-Discovery,” RAND Publication (2012), see http://www.rand.org/pubs/monographs/MG1208.html TREC Legal Track Home Page, http://trec-legal.umiacs.umd.edu (includes bibliography for further reading) The Sedona Conference®, The Sedona Conference Commentary on Information Governance (2013) Latest “Supervised Learning/Predictive Coding” Case Law: •  Da Silva Moore v. Publicis Groupe, 2012 WL 607412 (S.D.N.Y. Feb. 24, 2012), approved and adopted in Da Silva Moore v. Publicis Groupe, 2012 WL 1446534, at *2 (S.D.N.Y. Apr. 26, 2012) •  EORHB v. HOA Holdings, Civ. No. 7409-VCL (Del. Ch. Oct. 15, 2012) •  Global Aerospace Inc., et al. v. Landow Aviation, L.P., et al., 2012 WL 1431215 (Va. Cir. Ct. Apr. 23, 2012) •  In re Actos (Pioglitazone) Products, 2012 WL 3899669 (W.D. La. July 27, 2012) •  Kleen Products, LLC v. Packaging Corp. of America, 10 C 5711 (N.D. Ill.) (Nolan, M.J.) •  In re Biomet M2a Magnum Hip Implant Products Liability Litigation, 3:12-MD-2391 (S.D. Ind.) (April 18, 2013)
  • 37. Jason R. Baron Of Counsel Drinker Biddle & Reath LLP 1500 K Street, N.W. Washington, D.C. 20005 (202) 230-5196 Email: jason.baron@dbr.com