Collection Technology 
Professor Richard Adams 
University of Western Australia
Digital forensics – 10 years ago
Covert operations were challenging
Dealing with multiple computers - even more challenging!
Things have moved on
In-house skills 
Every large organisation has access to IT professionals who are able to collect data from their own networks for electronic discovery, internal digital forensic investigations and incident response purposes with the right tools and the appropriate training. 
Unfortunately, with regard to the collection process, the current tools are limited in their application.
What are the current collection tool limitations? 
ESI collection tools – These usually drag large quantities of data across the network for central processing (such as indexing), after which most of the data is discarded. This loads the network, lengthens the collection and adds to the costs for the business. 
Digital forensic collection tools – These do not scale well in the business environment and tie a person to the process because they can’t be automated. Remote collection features are limited.
Digital forensic considerations 
There is no legal requirement to capture a full bit-for-bit image of a device in order to provide potential evidence in court 
In the majority of cases the documents that are significant in a case are not found in ‘free space’ but are intact files (although they may have been recovered after deletion with their metadata intact) 
Courts are starting to push back on electronic evidence that cannot be attributed to a particular person (such as some items found in unallocated space) 
Live acquisitions are now much more commonplace in a corporate environment in order to capture RAM and encrypted data as well as address increasing data volumes
Ideal situation – digital forensic investigation 
On a matter in which many people could be involved you would capture a ‘forensic’ image of each machine for processing later in a forensic tool. This collection process would typically involve either: 
(a) one person being physically located with each target machine or 
(b) the image being captured across the network and requiring an operator to connect to each machine
Considerations - One person being physically located with each target machine 
Do you have enough trained staff to acquire each machine? 
Do you have enough time to acquire the images? 
What if the machines are in remote locations? 
Do you have enough equipment for each machine? (write-blockers/dongles/boot discs/storage drives) 
What if this needs to be done covertly? 
For multiple machines spread across different sites/countries this is not a realistic scenario to contemplate for an organisation.
Considerations - the image being captured across the network and requiring an operator to connect to each machine 
Can the network handle the load? 
Is the network fast enough? 
Can you prevent interference with the target machine during the operation? 
Do you have enough time to collect this way? 
Experience shows that few organisations have the network capacity to handle multiple collections in this fashion in a timely manner
Ideal functionality – ESI collection tool 
From any machine on the network, identify an unlimited number of target machines and start processing on those machines based on predefined selection criteria that include keywords and phrases 
Exclude file types and directories from searching 
All files matching the selection criteria (including emails, compressed files and unknown file types) must be collected 
All data (including the selection criteria) is encrypted 
Only collect files that match the selection criteria 
Unicode, UTF, ASCII search capability 
Minimise disruption of the target machine users 
Suspend power-saving settings 
Suspend defined processes 
Output for processing on any review platform
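The selection step described above can be sketched as follows. This is a minimal illustration, not the tool's actual implementation: the keyword, the exclusion lists, the encoding list and every function name are assumptions made for the example. It encodes each keyword in several encodings, skips excluded file types and directories, and returns only the files that contain a match.

```python
from pathlib import Path

# Hypothetical selection criteria; all values here are illustrative only.
EXCLUDED_SUFFIXES = {".exe", ".dll"}
EXCLUDED_DIRS = {"Windows", "Program Files"}
ENCODINGS = ["ascii", "utf-8", "utf-16-le"]


def keyword_patterns(keywords, encodings=ENCODINGS):
    """Encode each lower-cased keyword in every target encoding."""
    patterns = set()
    for word in keywords:
        for enc in encodings:
            try:
                patterns.add(word.lower().encode(enc))
            except UnicodeEncodeError:
                continue  # keyword cannot be represented in this encoding
    return patterns


def is_excluded(path, root):
    """True if the file type or any parent directory is on an exclusion list."""
    parents = path.relative_to(root).parts[:-1]
    return (path.suffix.lower() in EXCLUDED_SUFFIXES
            or any(part in EXCLUDED_DIRS for part in parents))


def matching_files(root, keywords):
    """Return files under root whose raw bytes contain any keyword."""
    root = Path(root)
    patterns = keyword_patterns(keywords)
    hits = []
    for path in root.rglob("*"):
        if not path.is_file() or is_excluded(path, root):
            continue
        try:
            # bytes.lower() only folds ASCII letters, so matching is
            # case-insensitive for ASCII keywords only. The whole file is
            # read into memory, which a real tool would avoid.
            data = path.read_bytes().lower()
        except OSError:
            continue  # unreadable or locked file; skipped in this sketch
        if any(pattern in data for pattern in patterns):
            hits.append(path)
    return hits
```

Because the search works on raw bytes rather than parsed documents, it also covers unknown file types. Emails inside mail stores and compressed files would need format-specific handling that is not shown here.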
Alternative ‘ideal’ situation for a digital forensic investigation 
Deploy the ‘ideal’ ESI tool but with added functionality: 
Capture RAM 
Capture Pagefile 
Capture Swapfile 
Capture Hibernation file 
Capture the Windows Registry 
Identify scanned documents that can’t be text-searched 
Covert capabilities
So what functionality does a common tool need?
Technology features that can make the ideal ESI/Forensic collection tool possible 
Running purely in memory on the target 
Searching and extracting emails from OSTs, PSTs, etc. that are in use 
Searching through unknown file types on the target 
Collecting system files from a running machine 
Searching and extracting data from compressed files 
Command-based or hidden interface capabilities 
Identifying scanned documents 
Undeleting files on the target 
Collect details of running processes 
Suspend defined processes 
Input criteria for review tools are well established, which helps when designing an API 
The ability to re-start the process if interrupted 
The ability to notify on completion/failure 
The ability to undertake plain text searches at the disk level rather than at the file system level
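A disk-level plain text search amounts to scanning a raw byte stream instead of walking the file system. The sketch below shows one way to do that in fixed-size chunks, keeping a small overlap so that a term spanning a chunk boundary is still found. It is an assumption-laden illustration: `find_offsets` is a hypothetical name, and on a real target the stream would be a raw volume opened with suitable privileges rather than the in-memory buffer used in the usage note.

```python
import io


def find_offsets(stream, needle, chunk_size=1024 * 1024):
    """Yield the absolute byte offset of every occurrence of needle."""
    if not needle:
        raise ValueError("needle must not be empty")
    overlap = len(needle) - 1  # bytes carried over between chunks
    tail = b""
    base = 0  # absolute offset of the first byte of the current window
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        window = tail + chunk
        start = 0
        while True:
            idx = window.find(needle, start)
            if idx == -1:
                break
            yield base + idx
            start = idx + 1  # allow overlapping occurrences
        # Keep the last `overlap` bytes; they are too short to hold a full
        # match on their own, so no offset is ever reported twice.
        keep = window[-overlap:] if overlap else b""
        base += len(window) - len(keep)
        tail = keep
```

For example, `list(find_offsets(io.BytesIO(data), b"ttest"))` returns the offsets of the search term in `data`. Mapping an offset back to a file or to unallocated space is a separate step that depends on the file system.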
Potential Scenarios for a Forensic Discovery (FD) Tool
Network deployment 
Define selection criteria and the storage location for collected data in a ‘configuration’ file 
Identify target machines 
Create a network share 
Load FD tool and configuration files into network share 
Assign target machines to a group 
Create a script to load the FD tool from the share when any target machine is connected to the network 
Receive email when each target collection is completed and then review/process the collected data as appropriate
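The ‘configuration’ file in the first step could take many forms. The sketch below assumes a JSON file, purely for illustration; the field names (`keywords`, `output_location`, `notify_email` and so on), the share path and the email address are all hypothetical and are not taken from any actual FD tool.

```python
import json
from dataclasses import dataclass, field


@dataclass
class CollectionConfig:
    """Hypothetical layout for a collection configuration file."""
    keywords: list
    output_location: str
    excluded_extensions: list = field(default_factory=list)
    excluded_directories: list = field(default_factory=list)
    notify_email: str = ""


def parse_config(text):
    """Parse and validate a JSON configuration string."""
    raw = json.loads(text)
    config = CollectionConfig(**raw)  # unknown or missing keys raise TypeError
    if not config.keywords:
        raise ValueError("at least one keyword is required")
    if not config.output_location:
        raise ValueError("output_location is required")
    return config


# Illustrative configuration; every value is made up for the example.
EXAMPLE = r"""
{
  "keywords": ["ttest"],
  "output_location": "\\\\fileserver\\fd-output",
  "excluded_extensions": [".exe", ".dll"],
  "notify_email": "investigator@example.com"
}
"""
```

Calling `parse_config(EXAMPLE)` returns a `CollectionConfig` whose `output_location` points at the network share. Validating the file before it is placed on the share avoids deploying a collection that cannot start on the targets.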
Individual machine deployment - overt 
Define selection criteria and the storage location for collected data in a ‘configuration’ file 
Load FD tool and configuration files onto the required number of external collection devices (e.g. USB backup drives) 
Provide the collection devices to any staff member (such as the user of the target machine) for them to attach to the target machine 
Provide instructions to run the FD tool from the collection device 
Receive email on completion 
Instruct the staff member to unplug and return the collection device
Individual machine deployment - covert 
Define selection criteria and the storage location for collected data in a ‘configuration’ file 
Load FD tool and configuration files onto the required number of external collection devices (e.g. USB backup drives) 
Provide the collection devices to authorised staff for them to attach to the target machine(s) out of office hours 
Provide instructions to run the FD tool from the collection device (alternatively RDP to the target(s) and run the FD tool) 
Receive email(s) on completion 
Instruct the staff member(s) to unplug and return the collection device
Benefits of an FD tool 
COST – reduction in data collected means a reduction in collection costs and a consequential reduction in processing costs 
RESOURCES – Remove the requirement for skilled staff to be tied up in the collection process 
INFRASTRUCTURE – Reduce the impact on networks by dramatically reducing the amount of data transferred 
SPEED – By using the target machines for processing, the total collection time is reduced to the time of the slowest machine 
COMPLETENESS – By undertaking plain text searches at the disk level rather than at the file system level, all data is searched rather than a limited number of file types
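The SPEED point can be shown with simple arithmetic. The per-machine times below are invented for the example; the sketch assumes each target processes its own data independently and ignores deployment overhead and the time to transfer the matched files.

```python
def serial_time(hours_per_machine):
    """One operator collecting each machine in turn: the times add up."""
    return sum(hours_per_machine)


def parallel_time(hours_per_machine):
    """Every target processes itself at once: the slowest machine decides."""
    return max(hours_per_machine)


# Illustrative processing times in hours for four target machines.
hours = [2.0, 3.5, 1.0, 6.0]
print(serial_time(hours), parallel_time(hours))
```

With these figures the one-at-a-time collection takes 12.5 hours, while the parallel collection finishes when the slowest machine does, after 6 hours.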
Questions? 
Toggle between an eDiscovery and a Digital Forensic collection with pre-set options?
Proof of concept: a plain text search of a live machine 
Looking for ANY files on a remote target with ‘ttest’ in them 
(a statistics reference)
Setting collection options
Add search term
Exclusions
Overt completion message
Notification via email
Initial review
‘Unknown’ file types

Fusing digital forensics, electronic discovery and incident response
