The volume of data being created continues to increase. Whilst some of this material plays an increasingly key role in determining the outcome of disputes, the legal cost and time budgets allocated for reviewing potentially responsive material are often not increased accordingly.
This presentation will explore how using investigative and battlefield data analytics practices can get the most out of your early case assessment exercise and allow you to see the bigger picture faster whilst conducting a defensible search and review of the available data.
You will learn:
- how visualisations can help with prioritisation
- how Nuix can automatically draw links between your data to draw your attention to key evidence
- how to perform a qualitative review rather than an exhaustive one
Who will benefit most?
Incident Responders, Counter-Terrorism Analysts, Internal Fraud/Employee Misconduct Investigators, tiered Reviewers, time-critical Data Analysts, everyone whose datasets have outgrown their capacity!
Nuix webinar presentation: See the bigger picture faster – early case assessment (ECA) best practices
1. See the Bigger Picture Faster
Early Case Assessment Best Practices
2. 23 November 2016 Copyright Nuix 2016 2
Presenters
Aidan Jewell, Solutions Consultant, Nuix
Aidan joined Nuix in 2014, bringing a decade of digital forensic investigation experience to
the EMEA team. As a Solutions Consultant, Aidan is responsible for pre- and post-sales
technical consultation, in addition to sharing his Nuix and investigations experience and
expertise with clients through workshops and the Nuix Bytes YouTube channel.
Carl Barron, Senior Solutions Consultant, Nuix
Carl joined the company in March 2012. He provides pre- and post-sales consultancy,
technical support and solution implementation. Carl brings a wide variety of knowledge in
both hardware and software with an enthusiastic approach to helping customers improve
workflows. Prior to joining Nuix, Carl worked as a Forensic Technician for a leading
Litigation Support Vendor in London.
3. Session Agenda
• Introduction
• Outline of current problem (Data Volumes)
• What is ECA?
• Benefits of ECA
• Tiered Processing
• Early Access & Collaboration
• Visuals
• Advanced ECA Features
• Summary
5. Data volumes and filing in 1986
1986 – back in the good old days…
• Dictate, approve and send perhaps 50 documents per day
• All documents received and carbon copies of documents sent were filed
• We had desk diaries
• Some firms kept a central book for attendance notes of important
discussions
• In a couple of days you could read into the documents – involving up to,
say, 2,000 documents - 2 metres of shelf space
6. Data volumes and filing in 2016
2016 – surrounded by technology…
• Send and receive by email hundreds of documents each day, with
still larger volumes of material coming in via SFTP
• Copies are saved all over the place (and on multiple devices)
• Yet more lurking “in the Cloud”
• Jeb Bush’s email dump – 1,800,000 emails - over a kilometre of
shelf space
7. Data everywhere
1 Email from me to you…
8. Data everywhere
1 Email from me to you…~12 copies
9. Data Volume
Year 2000: 20GB Hard Drive = 6 Rooms
10. Data Volume
Year 2016: 1TB Hard Drive = 300 Rooms
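The two Data Volume slides are consistent with a simple linear scale. The sketch below checks the arithmetic, assuming decimal units (1 TB = 1,000 GB) and that the number of rooms grows in proportion to drive capacity; the function name is illustrative only.

```python
def rooms_for(capacity_gb: float) -> float:
    """Scale the slide's baseline (a 20 GB drive = 6 rooms) linearly."""
    # Assumption: rooms grow in direct proportion to capacity.
    return capacity_gb * 6 / 20


if __name__ == "__main__":
    print(rooms_for(20))    # the year-2000 drive
    print(rooms_for(1000))  # a 1 TB drive, taking 1 TB = 1,000 GB
```

A fifty-fold increase in capacity gives a fifty-fold increase in rooms, which matches the 300 rooms quoted for 2016.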
12. What is ECA?
Definition
• An industry-specific term generally used to describe a variety of tools or methods for investigating and
quickly learning about a Document Collection for the purposes of estimating the risk(s) and cost(s) of
pursuing a particular legal course of action. 1
• A widely abused term in which corporate data is sifted and categorised with a view to determining an
organisation's exposure in the context of a dispute. The best ECA systems allow the sifting to take place
within a corporation's own data store and can be used to drill down rapidly to identify the most pertinent
evidentiary material and to facilitate decisions whether to litigate or settle. 2
1. Maura R. Grossman and Gordon V. Cormack, EDRM page & The Grossman-Cormack Glossary of Technology-Assisted Review, with Foreword by John M. Facciola, U.S. Magistrate Judge, 2013 Fed. Cts. L. Rev. 7 (January 2013).
2. LitSavant Ltd., Glossary, http://www.litsavant.com/full-glossary.aspx
14. Why ECA?
• Case Strategy
• Reduce Risk
• Reduce Cost
• Fight or settle?
• Dive into the facts of the data
• Proactively manage litigation
15. Proportionality
• Budgets are limited
• Courts are increasingly keen to avoid traditional, standard disclosure
• Need to cull multiple copies
• Equally, where appropriate, ensure the full history of documents is
recovered
• Involving forensic experts to collect the documents feels like
“overkill” (and is both expensive and disruptive)
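Culling multiple copies while still being able to recover where each copy lived can be sketched with a content hash. This is a minimal illustration, not Nuix's deduplication logic: the function names are hypothetical, and SHA-256 over the raw bytes is an assumed choice of digest.

```python
import hashlib
from collections import defaultdict


def group_by_digest(items: dict[str, bytes]) -> dict[str, list[str]]:
    """Group item names by the SHA-256 digest of their content."""
    groups: dict[str, list[str]] = defaultdict(list)
    for name, content in items.items():
        groups[hashlib.sha256(content).hexdigest()].append(name)
    return dict(groups)


def cull(items: dict[str, bytes]) -> tuple[list[str], dict[str, list[str]]]:
    """Keep one representative per digest and record the copies it stands for."""
    kept: list[str] = []
    copies: dict[str, list[str]] = {}
    for names in group_by_digest(items).values():
        first, *rest = names
        kept.append(first)
        copies[first] = rest  # retained so the full history can be recovered
    return kept, copies
```

Only the representatives go forward to review; the `copies` mapping keeps the record of every other location, which addresses the point about recovering a document's full history.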
16. Early Case Assessment
• Often just a simple investigation
• Over 95% of disputes settle rather than proceed to a hearing
• The key issues are always the same:
– Resource
– Investigate further or stop?
– Fight or flee?
17. Early Case Assessment
• Numbers, Statistics & Predicting the cost of review
• Investigative Review
• Dive into the facts of the data
• Fight or settle?
• Transition into review afterwards
• Case Strategy
20. Tiered Processing
Tier 1: Metadata and Thumbnails
- Identify key files/exhibits/timelines for deeper processing
- 80-90% of the total files (no logs, for example)
Tier 2: Process Text, Extract Entities, Near Duplication
- Performed on tagged items (documents, communications etc.)
- 20-40% of the total files
Tier 3: Forensics
- Analyse registry, slack space etc.
- 1-5% of the total files
Tier 4: Carving
- Smart carving of unallocated clusters
- 1% of the total files
90-95% of Cases
finish here
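To get a feel for what the tier percentages mean for a given dataset, they can be turned into item-count ranges. The sketch below uses only the percentages quoted on the slide; the dictionary layout and function name are illustrative assumptions.

```python
# Share of the total files handled at each tier, as (low %, high %) from the slide.
TIER_SHARES = {
    "Tier 1": (80, 90),  # metadata and thumbnails
    "Tier 2": (20, 40),  # text, entities, near-duplication
    "Tier 3": (1, 5),    # forensics
    "Tier 4": (1, 1),    # carving
}


def tier_item_ranges(total_items: int) -> dict[str, tuple[int, int]]:
    """Return the (low, high) number of items expected at each tier."""
    return {
        tier: (total_items * low // 100, total_items * high // 100)
        for tier, (low, high) in TIER_SHARES.items()
    }


if __name__ == "__main__":
    for tier, (low, high) in tier_item_ranges(1_000_000).items():
        print(f"{tier}: {low:,} to {high:,} items")
```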
21. Sample Tier 1 Processing Settings
In the ‘MIME Type Filtering’ tab deselect the following:
- Spreadsheets: CSV files (deselect Descendants)
- System Files: Microsoft Registry Decoded Data, Microsoft Registry Key
- Containers: Java Archive, Microsoft Registry File
- No Data: Inaccessible Content
- Logs: All
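The effect of deselecting these entries can be pictured as a simple exclusion filter. The sketch below is a conceptual model only: it uses the category and type labels from the slide rather than real MIME type strings, does not model the "Descendants" option, and the `Item` type and `is_selected` function are hypothetical, not part of any Nuix API.

```python
from typing import NamedTuple, Optional


class Item(NamedTuple):
    name: str
    category: str  # label from the slide, e.g. "Logs" or "Containers"
    kind: str      # label from the slide, e.g. "Java Archive"


# Deselected entries for Tier 1. None means every kind in the category is excluded.
TIER1_DESELECTED: dict[str, Optional[set[str]]] = {
    "Spreadsheets": {"CSV files"},
    "System Files": {"Microsoft Registry Decoded Data", "Microsoft Registry Key"},
    "Containers": {"Java Archive", "Microsoft Registry File"},
    "No Data": {"Inaccessible Content"},
    "Logs": None,
}


def is_selected(item: Item) -> bool:
    """Return True if the item would still be processed at Tier 1."""
    if item.category not in TIER1_DESELECTED:
        return True
    excluded = TIER1_DESELECTED[item.category]
    return excluded is not None and item.kind not in excluded
```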
22. Sample Tier 2 Processing Settings
These settings will be run across only those files selected
for deeper analysis. This will populate the Full Text Indices
for those files, as well as allow for Near Duplicate
highlighting, entity extraction and analysis/linking, and
enhanced multimedia filtering.
In the ‘MIME Type Filtering’ tab deselect the following:
- Spreadsheets: CSV files (deselect Descendants)
- System Files: Microsoft Registry Decoded Data, Microsoft Registry Key
- Containers: Java Archive, Microsoft Registry File
- No Data: Inaccessible Content
- Logs: All
23. Sample Tier 3 Processing Settings
These settings are designed to bring registry analysis and file slack
examination into the investigation, only for those exhibits that
require this deeper level of interrogation.
It also prepares the Unallocated Clusters for intelligent carving by
hashing them.
In the ‘MIME Type Filtering’ tab TICK the following:
- System Files: Microsoft Registry Decoded Data, Microsoft Registry Key
- Containers: Microsoft Registry File
Depending on the investigation, you may wish to also TICK:
- Containers: Java Archive
- No Data: Inaccessible Content
- Logs: All
24. Sample Tier 4 Processing Settings
This final tier is for intelligent carving of Unallocated
Clusters.
By identifying and selecting only those ‘chunks’ of UC that
contain data (via hash comparison), carving can be
accomplished 60-80% quicker than if you were to run
carving over all of the UC.
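The idea of selecting only those chunks that contain data can be sketched as follows. The slide says only that the selection is made "via hash comparison"; comparing each chunk's hash against the hash of a zero-filled chunk is an assumption made here for illustration, as are the chunk size and the function name.

```python
import hashlib

CHUNK_SIZE = 4096  # hypothetical chunk size in bytes


def chunks_with_data(unallocated: bytes, chunk_size: int = CHUNK_SIZE) -> list[int]:
    """Return the offsets of chunks whose hash differs from an all-zero chunk."""
    empty_digest = hashlib.sha256(bytes(chunk_size)).hexdigest()
    offsets = []
    for offset in range(0, len(unallocated), chunk_size):
        chunk = unallocated[offset:offset + chunk_size]
        # Pad a short final chunk so it is compared against the same baseline.
        digest = hashlib.sha256(chunk.ljust(chunk_size, b"\x00")).hexdigest()
        if digest != empty_digest:
            offsets.append(offset)
    return offsets
```

Only the returned offsets would then be passed to the carver, which is where a time saving of the kind described on the slide would come from.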
25. Quality Checking Your Data
- Corrupted Items/Containers: may also contain encrypted TrueCrypt containers
- Non-searchable PDFs: PDFs with no text layer!
- Bad Extension: where the file extension doesn’t match the signature
- Encrypted: files/containers Nuix believes to be encrypted
- Not Processed
- Poisoned Items: items that cause workers to get stuck in a loop
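A "Bad Extension" check compares the leading bytes of a file (its signature) with the extension in its name. The sketch below shows the principle with a handful of well-known signatures. It is not how Nuix performs the check, and the table and function name are illustrative assumptions.

```python
from pathlib import PurePath

# A few well-known file signatures and the extensions they normally carry.
SIGNATURES = {
    b"%PDF": {".pdf"},
    b"\x89PNG\r\n\x1a\n": {".png"},
    b"\xff\xd8\xff": {".jpg", ".jpeg"},
    b"PK\x03\x04": {".zip", ".docx", ".xlsx", ".pptx", ".jar"},
}


def has_bad_extension(filename: str, header: bytes) -> bool:
    """Return True if the header matches a known signature but the extension does not."""
    extension = PurePath(filename).suffix.lower()
    for magic, extensions in SIGNATURES.items():
        if header.startswith(magic):
            return extension not in extensions
    return False  # unknown signature: no judgement is made
```

For example, a file named `holiday.jpg` whose first bytes are `PK\x03\x04` would be flagged, because that signature belongs to ZIP-based formats.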
27. Early Access & Collaboration
Early Case Assessment
28. Early Access & Collaboration
“Victorious warriors win first and then go to
war, while defeated warriors go to war first
and then seek to win.”
― Sun Tzu
29. Early Access & Collaboration
Workflow stages: Index Data, Export Data, Import Data, Review Data
Diagram labels: NUIX WORKSTATION, NUIX DIRECTOR, REVIEW PLATFORM, EXPORT + REPORT
30. Early Access & Collaboration
Workflow stages: Index Data/ECA, Review Data
Diagram labels: NUIX WORKSTATION, NUIX DIRECTOR, NUIX WEB REVIEW & ANALYTICS
31. Early Access & Collaboration
34. Visualisation
“The purpose of visualisation is insight, not pictures.” [1]
[1] Ben Shneiderman, “Research Agenda: Visual Overviews for Exploratory Search”, National Science Foundation workshop on Information Seeking Support Systems, June 26-27, 2008
36. Visualisation
Analysing Minard's Visualisation Of Napoleon's 1812 March
https://robots.thoughtbot.com/analyzing-minards-visualization-of-napoleons-1812-march
37. Visualisation
• What does this tell us?
– Lots of data
– Comms in 2000, 2004, 2014
– Lots of recipients
• Much more context
– 2 key communicators
– 3 separate networks
Can this inform better
analysis & review?
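Observations such as "2 key communicators" and "3 separate networks" are the kind of result a communication graph yields. The sketch below counts message volume per person and finds connected groups from plain (sender, recipient) pairs. It illustrates the general idea only and is not the analysis behind the Nuix visualisation; the function name and input shape are assumptions.

```python
from collections import Counter, defaultdict


def analyse_network(messages):
    """Return (message volume per person, list of separate networks).

    messages: iterable of (sender, recipient) pairs.
    """
    volume = Counter()
    neighbours = defaultdict(set)
    for sender, recipient in messages:
        volume[sender] += 1
        volume[recipient] += 1
        neighbours[sender].add(recipient)
        neighbours[recipient].add(sender)

    networks, seen = [], set()
    for person in neighbours:
        if person in seen:
            continue
        # Depth-first walk to collect everyone reachable from this person.
        stack, component = [person], set()
        while stack:
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(neighbours[current] - component)
        seen |= component
        networks.append(component)
    return volume, networks
```

`volume.most_common(2)` then gives the two busiest communicators, and `len(networks)` gives the number of groups that never exchanged messages with one another.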
38. Visualisation
• A quick look reveals
– 4 primary sources
– Connect money values
– 3 Countries
– 3 Companies
• Did we expect this?
• Can this inform better
analysis & review?
46. DEMO
Automatic identification of relevant information
Visualise Links between items/suspects (Pulling a
thread)
52. Search and Tag
Allows Nuix to automatically tag
items responsive to queries
Can import/share predefined S&T templates
in CSV format
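The search-and-tag concept can be sketched in a few lines. The CSV layout below (a `tag` column and a `query` column) is a hypothetical format chosen for illustration and is not the Nuix S&T template format; queries are treated as plain case-insensitive keywords rather than Nuix query syntax.

```python
import csv
import io


def load_template(csv_text: str) -> list[tuple[str, str]]:
    """Read (tag, query) pairs from CSV text with 'tag' and 'query' columns."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(row["tag"], row["query"]) for row in reader]


def search_and_tag(documents: dict[str, str],
                   template: list[tuple[str, str]]) -> dict[str, set[str]]:
    """Apply every tag whose keyword appears in a document's text."""
    tags: dict[str, set[str]] = {name: set() for name in documents}
    for tag, query in template:
        needle = query.lower()
        for name, text in documents.items():
            if needle in text.lower():
                tags[name].add(tag)
    return tags
```

Because the template is just data, the same set of tag and query pairs can be shared between cases or investigators, which is the benefit the slide describes.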
53. Digest/Hash Lists
- Digest Lists: automatically identify files in your dataset that match by MD5
- Shingle Lists: automatically identify near-duplicates
- Word Lists: automatically identify files containing keywords
- Fuzzy Hash Lists: compares SSDeep hashes to identify potential malware
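Two of these list types are easy to illustrate with the standard library. The sketch below shows an MD5 digest-list lookup and a near-duplicate score based on word shingles and Jaccard similarity. The shingle size, the Jaccard measure and all function names are assumptions for illustration; the slide does not describe how Nuix builds or compares shingles, and SSDeep fuzzy hashing is not covered here.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of the data as a hex string."""
    return hashlib.md5(data).hexdigest()


def matches_digest_list(data: bytes, digest_list: set[str]) -> bool:
    """Exact match: is this content's MD5 in the list of known digests?"""
    return md5_hex(data) in digest_list


def shingles(text: str, size: int = 3) -> set[tuple[str, ...]]:
    """Return the set of overlapping word sequences of the given size."""
    words = text.lower().split()
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(a: str, b: str, size: int = 3) -> float:
    """Jaccard similarity of the two texts' shingle sets (1.0 = identical sets)."""
    first, second = shingles(a, size), shingles(b, size)
    if not first and not second:
        return 1.0
    return len(first & second) / len(first | second)
```

A digest match is all-or-nothing, so a single changed character defeats it; the shingle score degrades gradually, which is what makes it useful for spotting near-duplicates.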
54. Automatic Classifiers – Predictive Coding
Nuix can learn how you tag
items, and once it has built up
a sufficient model, can use
that to automatically tag
unreviewed items.
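The general idea of learning from reviewer tags can be sketched with a small naive Bayes text classifier. This is a stand-in technique chosen for illustration; the slide does not say which model Nuix uses, and the class and method names here are hypothetical.

```python
import math
from collections import Counter, defaultdict


class TagClassifier:
    """Multinomial naive Bayes over words, with add-one smoothing."""

    def __init__(self) -> None:
        self.word_counts = defaultdict(Counter)  # tag -> word -> count
        self.doc_counts = Counter()              # tag -> number of reviewed items
        self.vocabulary = set()

    def learn(self, text: str, tag: str) -> None:
        """Record one reviewed item and the tag the reviewer gave it."""
        words = text.lower().split()
        self.word_counts[tag].update(words)
        self.doc_counts[tag] += 1
        self.vocabulary.update(words)

    def predict(self, text: str) -> str:
        """Return the most probable tag for an unreviewed item."""
        if not self.doc_counts:
            raise ValueError("no reviewed items have been learned yet")
        total_docs = sum(self.doc_counts.values())
        words = text.lower().split()
        best_tag, best_score = None, -math.inf
        for tag, docs in self.doc_counts.items():
            score = math.log(docs / total_docs)
            total_words = sum(self.word_counts[tag].values())
            for word in words:
                count = self.word_counts[tag][word]
                score += math.log((count + 1) / (total_words + len(self.vocabulary)))
            if score > best_score:
                best_tag, best_score = tag, score
        return best_tag
```

In practice the "sufficient model" mentioned on the slide matters: with only a handful of reviewed items, predictions like these should be treated as prioritisation hints rather than final coding decisions.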
56. Early Case Assessment
Five Practical Tips for Data Analytics in Early Case Assessment
1. Find Out What You Have
2. Look for Issues in the Data
3. Learn what your key players hold
4. Answer the Who, What and When
5. Reduce the noise
57. Summary
• The challenge to investigate and come to quick conclusions will
always exist.
• The traditional approach - reading everything - is no longer an option.
• The intermediate solution of coming up with keywords fails as
volumes of data continue to increase (proportionality).
• The ability of Nuix to ingest data from multiple sources, filter out
duplicate and irrelevant material, and home in on the relevant material
makes it an indispensable investigation and review tool.
59. Your way forward
Nuix training courses are designed to help
you unlock the full potential of your Nuix
investment and achieve great results, fast.
View our course options online at:
nuix.com/training
Right tool + right way = right results faster