The hottest topic in ediscovery continues to be the use of predictive analytics in making litigation more efficient, but why should we stop there? Can we take the lessons learned from the Da Silva Moore case about using new analytical tools and techniques and apply them to the information governance space? In this session, the founder of the TREC Legal Track and former Co-Chair of Working Group 1 of The Sedona Conference discusses how law firms may use analytics to add value on their clients’ information governance issues, including business intelligence, e-record archiving, and record classification, retention and remediation.
4. We have entered the era where
Big Data is ….
(c) Jason R. Baron 2014
5. The World Has Changed
§ We are not just managing thousands or millions of paper files
§ We are at an inflection point in history in terms of data volume
§ IDC Report: 1800 new exabytes this year
(1 exabyte=data equivalent of 50,000 yrs of continuous movies)
§ Open data policies vs. “the iceberg”:
a vast amount of information is
“hidden” underneath the web —how is it
to be reliably preserved and accessed?
(c) Jason R. Baron 2013
6. Reality:
The era of information inflation and Big Data in litigation has
just begun….
Lehman Brothers Investigation
— 350 billion page universe (3 petabytes)
— Examiner narrowed collection by selecting key
custodians, using dozens of Boolean searches
— Reviewed 5 million docs (40 million pages) using 70
contract attorneys
Source: Report of Anton R. Valukas, Examiner, In re Lehman Brothers Holdings Inc., et al., Chapter 11 Case
No. 08-13555 (U.S. Bankruptcy Ct. S.D.N.Y. March 11, 2010), Vol. 7, Appx. 5, at http://lehmanreport.jenner.com/.
7. Information governance is needed in a world where . . .
- 80% of enterprise data is unstructured
- 60% of documents are obsolete
- 50% of documents are duplicate
- 80% of documents are not retrieved by traditional search
8. www.aiim.org/infochaos
Do YOU understand the business challenge of the next 10 years?
This ebook from AIIM President John Mancini explains.
9. Traditional Document Review Processes
§ Labor intensive
§ Linear Review
§ Quality of manual coding for responsiveness open to question
(see RAND Study, 2012)
13. Example of Boolean search string from
U.S. v. Philip Morris
§ (((master settlement agreement OR msa) AND
NOT (medical savings account OR metropolitan
standard area)) OR s. 1415 OR (ets AND NOT
educational testing service) OR (liggett AND NOT
sharon a. liggett) OR atco OR lorillard OR (pmi
AND NOT presidential management intern) OR
pm usa OR rjr OR (b&w AND NOT photo*) OR
phillip morris OR batco OR ftc test method OR
star scientific OR vector group OR joe camel OR
(marlboro AND NOT upper marlboro)) AND NOT
(tobacco* OR cigarette* OR smoking OR tar OR
nicotine OR smokeless OR synar amendment OR
philip morris OR r.j. reynolds OR ("brown and
williamson") OR ("brown & williamson") OR bat
industries OR liggett group)
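The "term AND NOT (false hit)" pattern that recurs in the string above can be sketched in a few lines of Python. This is only an illustration of how one such clause screens out unwanted meanings of an ambiguous term; the sample documents and the helper names `contains` and `term_and_not` are hypothetical, and the sketch does not reproduce the full query or any particular search engine's syntax.

```python
import re
from typing import Sequence


def contains(text: str, phrase: str) -> bool:
    """True if `phrase` occurs in `text` as whole words, ignoring case."""
    pattern = r"(?<!\w)" + re.escape(phrase) + r"(?!\w)"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def term_and_not(text: str, term: str, excluded: Sequence[str]) -> bool:
    """Evaluate `term AND NOT (excluded_1 OR excluded_2 ...)` on one document."""
    return contains(text, term) and not any(contains(text, e) for e in excluded)


# Hypothetical documents, invented for illustration only.
docs = [
    "Counsel circulated the MSA draft to the states.",
    "Enroll in a medical savings account (MSA) by Friday.",
    "Marlboro sales figures for the third quarter.",
    "Directions to the Upper Marlboro courthouse.",
]

# Two clauses taken from the search string above.
clauses = [
    ("msa", ["medical savings account", "metropolitan standard area"]),
    ("marlboro", ["upper marlboro"]),
]

# A document is a hit if any clause matches it.
hits = [d for d in docs if any(term_and_not(d, t, ex) for t, ex in clauses)]
for d in hits:
    print(d)
```

Each exclusion has to be anticipated and typed in by hand, which is one reason strings like this grow so long.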
14. Emerging New Strategies:
“Predictive Analytics”
Improved review and case
assessment: cluster docs thru
use of software with minimal
human intervention at front end
to code “seeded” data set
Slide adapted from Gartner Conference, June 23, 2010, Washington, D.C.
15. Defining “predictive coding” or
“TAR”
§ A process for prioritizing or coding a collection of electronic
documents using a computerized system that harnesses human
judgments of one or more subject matter experts on a smaller
set of documents and then extrapolates those judgments to the
remaining document population.
§ Also referred to as “supervised or active machine learning,”
“computer-assisted review” or “technology-assisted review”
Source: Adapted from Grossman-Cormack Glossary of Technology Assisted Review, v. 1.0 (Oct 2012)
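To make the definition concrete, here is a minimal sketch of the idea in Python: a reviewer codes a small seed set, and a simple model extrapolates those judgments to rank the uncoded documents. The classifier used here (a naive Bayes word model with add-one smoothing) is a stand-in chosen for brevity, not the method of any particular review platform, and the seed documents and function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def train(seed_set):
    """seed_set: list of (text, is_relevant) pairs coded by a human reviewer.

    Assumes the seed set contains at least one relevant and one
    irrelevant document.
    """
    word_counts = {True: Counter(), False: Counter()}
    doc_counts = Counter()
    for text, label in seed_set:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts[True]) | set(word_counts[False])
    return word_counts, doc_counts, vocab


def relevance_score(model, text):
    """Log-odds that `text` is relevant; higher means review it sooner."""
    word_counts, doc_counts, vocab = model
    score = math.log(doc_counts[True] / doc_counts[False])
    totals = {lbl: sum(word_counts[lbl].values()) for lbl in (True, False)}
    for word in tokenize(text):
        if word not in vocab:
            continue  # words never seen in the seed set carry no evidence
        p_rel = (word_counts[True][word] + 1) / (totals[True] + len(vocab))
        p_irr = (word_counts[False][word] + 1) / (totals[False] + len(vocab))
        score += math.log(p_rel / p_irr)
    return score


# Hypothetical seed set coded by a subject matter expert.
seed_set = [
    ("settlement agreement draft for tobacco litigation", True),
    ("cigarette marketing plan and settlement terms", True),
    ("lunch menu for the office party", False),
    ("parking garage closed on friday", False),
]
model = train(seed_set)

# Hypothetical uncoded documents, ranked by predicted relevance.
unreviewed = [
    "office party moved to friday",
    "revised settlement agreement attached",
]
ranked = sorted(unreviewed, key=lambda d: relevance_score(model, d), reverse=True)
for doc in ranked:
    print(round(relevance_score(model, doc), 2), doc)
```

In an active-learning workflow the top-ranked documents would go back to the reviewer, and their new codes would be added to the seed set before retraining.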
16. Judicial endorsement of predictive
analytics in document review by Judge
Peck in da Silva Moore v. Publicis
Groupe (SDNY Feb. 24, 2012)
This opinion appears to be the first in which a Court has approved of the
use of computer-assisted review. . . . What the Bar should take away from
this Opinion is that computer-assisted review is an available tool and
should be seriously considered for use in large-data-volume cases where
it may save the producing party (or both parties) significant amounts of
legal fees in document review. Counsel no longer have to worry about
being the ‘first’ or ‘guinea pig’ for judicial acceptance of computer-assisted
review . . . Computer-assisted review can now be considered judicially-
approved for use in appropriate cases.
17. The da Silva Moore Protocol
• Supervised learning
• Random sampling
• Establishment of seed set
• Issue tags
• Iteration
• Random sampling of docs deemed irrelevant
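The last step, random sampling of documents deemed irrelevant, can be sketched as follows. The sketch draws a simple random sample from the discard pile and, once reviewers have counted how many sampled documents turn out to be relevant, estimates the share of relevant material left behind using a normal-approximation interval. The sample size, the counts, and the function names are hypothetical and are not taken from the protocol itself.

```python
import math
import random


def sample_null_set(null_set_ids, sample_size, seed=0):
    """Draw a simple random sample (without replacement) of discarded docs."""
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced
    return rng.sample(list(null_set_ids), sample_size)


def elusion_estimate(relevant_found, sample_size, z=1.96):
    """Estimated share of relevant docs in the discard pile.

    Returns (point estimate, lower bound, upper bound) using a normal
    approximation; z=1.96 corresponds to roughly 95% confidence.
    """
    p = relevant_found / sample_size
    margin = z * math.sqrt(p * (1 - p) / sample_size)
    return p, max(0.0, p - margin), min(1.0, p + margin)


# Hypothetical numbers: 100,000 documents were deemed irrelevant, 1,500 are
# sampled for human review, and reviewers find 30 of them relevant after all.
null_set_ids = range(100_000)
sample = sample_null_set(null_set_ids, 1_500)
rate, low, high = elusion_estimate(relevant_found=30, sample_size=len(sample))
print(f"estimated elusion: {rate:.1%} (approx. 95% interval {low:.1%} to {high:.1%})")
```

A high estimate suggests that another training iteration is needed before the remaining documents are set aside.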
18. The demise of RM….
● John Mancini, President of AIIM:
• “If by traditional records management you mean
manual systems—even if they are computerized – then
I would say traditional records management is dead.
The idea that we could get busy people to care about
our complicated retention schedules, and drag and
drop documents into folders, and manually apply
metadata document by document according to an
elaborate taxonomy will soon seem as ridiculous as
asking a blacksmith to work on a Ferrari.”
19. Process Optimization Problem: The
transactional toll of user-based
recordkeeping schemes (“as is” RM)
20. …. and the need for better,
automated solutions ….
21. Email is still
the 800 lb.
gorilla of
ediscovery
22. Archivist/OMB Directive
● M-12-18, Managing Government Records
Directive, dated 8/24/12:
1.1 By 2019, Federal agencies will manage all
permanent records in an electronic format.
1.2 By 2016, Federal agencies will manage both
permanent and temporary email records in an
accessible electronic format.
http://www.whitehouse.gov/sites/default/files/omb/memoranda/2012/m-12-18.pdf
23. NARA Moved to the Cloud for Email with
Embedded RM/Autocategorization
24. Capstone Officials
Capstone officials may
include:
● Officials at or near the top of
an agency or an organizational
subcomponent
● Key staff members that may be
in positions that create or
receive presumptively
permanent email records
[Diagram: Capstone accounts and key staff accounts, shown apart from other accounts]
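The role-based idea behind Capstone can be sketched as a rule applied to the account rather than to each message. The sketch below assumes that email in accounts outside the Capstone and key staff groups is treated as temporary; the `Account` fields, the function name, and the example addresses are hypothetical and do not describe any agency's actual system.

```python
from dataclasses import dataclass


@dataclass
class Account:
    address: str
    is_senior_official: bool  # at or near the top of the agency or a subcomponent
    is_key_staff: bool        # position creates or receives presumptively permanent email


def email_disposition(account: Account) -> str:
    """Assign a disposition to the whole account based on the holder's role."""
    if account.is_senior_official or account.is_key_staff:
        return "permanent"
    # Assumption for this sketch: all other accounts are temporary.
    return "temporary"


# Hypothetical accounts, invented for illustration only.
accounts = [
    Account("administrator@agency.example", True, False),
    Account("chief.of.staff@agency.example", False, True),
    Account("analyst@agency.example", False, False),
]
for acct in accounts:
    print(acct.address, "->", email_disposition(acct))
```

Because the decision attaches to the role, no user has to file or tag individual messages.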
25. How To Avoid A Train Wreck With
Email Archiving….
Capture E-mail
But Utilize Records Management!
26. Information Governance / Records Analytics
Can advanced analytics techniques and technologies,
including Auto-Categorization, Auto-redaction, Auto-indexing,
Auto-translation, etc., be applied and leveraged
by Records Managers/Information Governance types?
Yes, but ….
27. Homage to Carl Linnaeus (1707-1778)
28. Linnaean classification of the animal kingdom
§ Kingdom: Animalia
§ Phylum: Chordata
§ Subphylum: Vertebrata
§ Superclass: Tetrapoda
§ Class: Mammalia
§ Subclass: Theria
§ Infraclass: Eutheria
§ Cohort: Unguiculata
§ Order: Primates
§ Suborder: Anthropoidea
§ Superfamily: Hominoidea
§ Family: Hominidae
§ Subfamily: Homininae
§ Genus: Homo
§ Subgenus: Homo (Homo)
§ Specific epithet: sapiens
30. The Coming Age of Dark Archives (and the
inability to provide access unless we have
smart ways of extracting signal from noise)
31. We should be leveraging the power of
predictive analytics to improve
information governance . . .
-- RM: defensible disposal of low value information
-- Regulatory compliance
-- Risk mitigation – segregating sensitive materials…
(PII, proprietary, etc.)
-- Business intelligence
-- E-discovery
-- Collaboration across enterprise
-- Providing access to dark data & archives
32. IG & Analytics: True Life Stories “Ripped from the Headlines”
§ The Case of the Wayward Would-Be Whistleblower
§ The Case of the Mistakenly Valued Merger & Acquisition
33. What is the IGI?
The IGI is a cross-disciplinary think tank and consortium
dedicated to advancing the adoption of Information Governance
practices and technologies through research, publishing,
advocacy, and peer-to-peer networking.
It provides industry thought leadership and benchmarking
designed to foster consensus and conversation.
It is a connector among the stakeholders of information
governance.
It is a promoter of industry best practices and standards.
www.iginitiative.com
34. “The future is here. It is just not evenly
distributed.”
--William Gibson
35. References
Sources Referencing Information Governance, Autocategorization & Predictive Coding
B. Borden & J.R. Baron, “Finding the Signal in the Noise: Information Governance, Analytics, and The Future of the
Law,” 20 Richmond J. Law & Technology 7 (2014), http://jolt.richmond.edu
J.R. Baron, “Law in the Age of Exabytes: Some Further Thoughts on ‘Information Inflation’ and Current Issues in
E-Discovery Search,” 17 Richmond J. Law & Technology (2011), see http://jolt.richmond.edu
N. Pace, “Where The Money Goes: Understanding Litigant Expenditures for Producing E-Discovery,” RAND
Publication (2012), see http://www.rand.org/pubs/monographs/MG1208.html
TREC Legal Track Home Page, http://trec-legal.umiacs.umd.edu (includes bibliography for further reading)
The Sedona Conference®, The Sedona Conference Commentary on Information Governance (2013)
Latest “Supervised Learning/Predictive Coding” Case Law:
• Da Silva Moore v. Publicis Groupe, 2012 WL 607412 (S.D.N.Y. Feb. 24, 2012), approved and adopted
in Da Silva Moore v. Publicis Groupe, 2012 WL 1446534, at *2 (S.D.N.Y. Apr. 26, 2012)
• EORHB v. HOA Holdings, Civ. No. 7409-VCL (Del. Ch. Oct. 15, 2012)
• Global Aerospace Inc., et al. v. Landow Aviation, L.P., et al., 2012 WL 1431215 (Va. Cir. Ct. Apr. 23, 2012).
• In re Actos (Pioglitazone) Products, 2012 WL 3899669 (W.D. La. July 27, 2012)
• Kleen Products, LLC v. Packaging Corp. of America, 10 C 5711 (N.D. Ill.) (Nolan, M.J.)
• In re Biomet M2a Magnum Hip Implant Products Liability Litigation, 3:12-MD-2391 (N.D. Ind.) (April 18, 2013)
37. Jason R. Baron
Of Counsel
Drinker Biddle & Reath LLP
1500 K Street, N.W.
Washington, D.C. 20005
(202) 230-5196
Email: jason.baron@dbr.com