Ph.D. Candidate in Computer Sciences
University of Brescia, dept. of Information Engineering
April 10, 2014
Branch of forensic science that studies the
identification, extraction and analysis of digital
data for use in a court of law.
In the beginning (from
the 80s until now) it
was all about
They were all (almost)
alike, and there were
plenty of standard
In the last 5-10 years everything began to
store digital data.
Use of specific tools
eBook Reader Forensics
Voyage Data Recorder Forensics
What do these devices have in common?
• Modern devices which contain digital data
• Their data could be required during an investigation
• No consolidated literature about them
The rationale behind this thesis is the ever-growing
need to perform digital
investigations on devices and systems that
have not yet been studied from this point of view.
What can we find in an iOS
device and how can we bring it
to a court...
Mobile and tablet worldwide market share of operating system usage for
November 2013. Net Market Share collects browser data from a
worldwide network of over 40,000 websites. (Credit: Net Market Share)
There is no simple way to extract data from an
No easy way to access its contents
without jailbreaking (which, by the way,
Encrypted filesystem (HFS+)
Not sharing anything with the rest of the World
No debug interfaces
Easiest way to peek inside the filesystem: the
Backup files are organized in a
hierarchy, whose first level
is the «Domain»:
• Media domain: media files,
MMS attachments, …
• Keychain domain: account
data and encrypted
• Home domain: data for
standard apps (contacts, mail
client, calendars, …)
• Wireless domain: data about
the telephone system (call
logs, connection logs, …)
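As a minimal sketch of how the domain relates to the files found on disk: in iTunes-style backups, each backed-up file is stored under a name derived from the SHA-1 hash of its domain and relative path joined by a dash. The function name below is hypothetical and the path is only an illustrative example, not taken from the slides.

```python
import hashlib


def backup_file_name(domain: str, relative_path: str) -> str:
    """Return the SHA-1 hex digest of "<domain>-<relative_path>".

    Assumption: the backup follows the iTunes naming scheme, where this
    digest is the name of the file inside the backup directory.
    """
    key = f"{domain}-{relative_path}"
    return hashlib.sha1(key.encode("utf-8")).hexdigest()


# Illustrative entry: the address book database in the Home domain.
name = backup_file_name("HomeDomain", "Library/AddressBook/AddressBook.sqlitedb")
print(name)
```

Knowing the domain and path of an artifact is therefore enough to locate it in the backup directory without first walking the whole manifest.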
PLIST Files (plain
text and binary)
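Both plist flavours can be read with the Python standard library, which detects the format on its own. The sketch below is only an illustration; the helper name `read_plist` is hypothetical.

```python
import plistlib


def read_plist(path):
    """Load a property list, whether it is plain-text (XML) or binary.

    plistlib.load() autodetects the format when no fmt argument is given.
    """
    with open(path, "rb") as handle:
        return plistlib.load(handle)
```

The result is an ordinary Python structure (dict, list, str, bytes, datetime, ...), so it can be inspected or exported without further conversion.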
data is stored in «Apps»
domain (for third party
applications) or «Home»
domain (for standard
The hierarchy of each
follows a standard
with WebKit offline
Thumbnails: generated from the media gallery for fast visualization
Address book data (Home domain)
Knowing about the data
location and structure is the
Next step: making it easily
usable for the ones who need
iPBA2 is a tool
Study the backup
Make it easier to
Right now it is the only complete open source suite for analysing
iOS backup data, and it is used by both researchers and
practitioners from all over the world.
Why an eBook reader is not
worthless in a forensics
• Because it is a widely used digital device.
• Because it holds digital data.
• Because no piece of data can be deemed
«worthless» in advance during an investigation.
• Because almost every practitioner says it is
worthless… which, by the way, it is not.
Locard’s exchange principle
"Wherever he steps, whatever he touches, whatever
he leaves, even unconsciously, will serve as a silent
witness against him. Not only his fingerprints or his
footprints, but his hair, the fibers from his clothes,
the glass he breaks, the tool mark he leaves, the
paint he scratches, the blood or semen he deposits
or collects. […]"
Forensic profiling refers to the study and
exploitation of traces in order to draw a profile
relevant to the investigation about criminal or
While traces may not be strictly dedicated to a
court use, they may increase knowledge of the
subject under investigation.
For our research, we chose a widely available
modern device, the PRS-650 by Sony.
Many of our results can probably be
reproduced, after further study, on
different devices from other vendors.
• E-paper display (6 inches, 800x600).
• Resistive touchscreen.
• 5 buttons.
• MontaVista Linux.
• 2GB internal flash memory.
• Removable SDHC and Memory Stick PRO
Books, documents, images,
Current position of documents.
Notes (written and audio).
Last reading of a document.
Pages read for each document.
Everything has a timestamp!
We can access the main storage through the USB mass storage interface
For the whole device…
For each document…
For each document:
• current position (page)
• timestamp of the last access
For each document:
• History of the last 100
page turns, with page
number and timestamp.
To perform the analysis, we built a Python script that parses
cache.xml, media.xml and cacheExt.xml and builds a graph of the
interactions between the user and the device.
The script extracts the timestamps and produces a data file with all the
timestamps found, to be plotted on a timeline.
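The extraction step can be sketched as follows. This is not the script used in the research: the XML layout, the attribute names `id` and `date`, and the RFC 2822 date format are all assumptions made for illustration, and `extract_timestamps` is a hypothetical name.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime


def extract_timestamps(xml_text, id_attr="id", date_attr="date"):
    """Collect (timestamp, document id) pairs from every element that
    carries a parsable date attribute, sorted chronologically.

    Assumption: dates are RFC 2822 strings with an explicit UTC offset.
    """
    events = []
    for element in ET.fromstring(xml_text).iter():
        raw = element.get(date_attr)
        if raw is None:
            continue
        try:
            when = parsedate_to_datetime(raw)
        except (TypeError, ValueError):
            continue  # skip values that are not valid dates
        events.append((when, element.get(id_attr)))
    return sorted(events, key=lambda event: event[0])


# Made-up sample, only to exercise the function.
SAMPLE = """<cache>
  <text id="12" date="Fri, 13 Jan 2012 19:40:00 +0000"/>
  <text id="7" date="Thu, 12 Jan 2012 08:15:30 +0000"/>
  <text id="9"/>
</cache>"""

for when, doc_id in extract_timestamps(SAMPLE):
    print(when.isoformat(), doc_id)
```

Each printed line is one point for the timeline: the timestamp goes on the X axis and the document ID on the Y axis.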
eBook reader usage in a two-month time span.
• X axis: time
• Y axis: ID of the document involved
Usage of the reader in a ten-minute span, for a single book.
• X axis: time
Virtually each action performed on the device
It is possible to build a forensically sound
The evidence gathered this way could be used
in court to:
◦ Draw a behavioural profile of a suspected offender.
◦ Support or deny an alibi.
◦ Provide additional useful information about the
Ship automation Echo sounder
And much more...
The Voyage Data Recorder (VDR)
is a mandatory device for all
medium-to-large modern
Its job is to keep a record of ship
data to be used in an accident
• Position, speed, heading
• Date and time
• Radar plot
• Audio from bridge and VHF
• Sonar depth
• Hull openings (watertight doors, fire
• Rudder position, propeller speed
• Weather station data (wind, ...)
• Onboard alarms
Data collecting unit
An industrial computer which
collects all data and temporarily
stores it on a magnetic disk.
Final Recording Medium
A rugged box containing a solid-
state memory, designed to
survive a catastrophic accident
and be recovered for further
Starting point: the
complete copy of the
internal disk of the
data collecting unit.
Analysis of the disk structure.
Mounting the partition
Analysis of the disk content: the «frame» directory
Unknown data files
The same goes for the «NMEA» directory.
∼800 MB of ASCII data in NMEA format
NMEA 0183 is a data exchange protocol used primarily in the
navigation field. It is the preferred way to exchange data between
• $: starting character.
• PREFIX: origin and type of data
• First 2 characters: originating device
• Other 3 characters: type of sentence
• Checksum: 2-digit hex XOR of all characters between «$» and «*».
$PREFIX, data0, data1, …, dataN*CHECKSUM
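The sentence layout above can be checked and split with a few lines of Python. This is a minimal sketch, not the tool built for the thesis; the function names are hypothetical and the sample sentence is made up, with its checksum computed by the code itself.

```python
from functools import reduce


def nmea_checksum(body: str) -> int:
    """XOR of every character between '$' and '*' (both excluded)."""
    return reduce(lambda acc, ch: acc ^ ord(ch), body, 0)


def parse_nmea(sentence: str) -> dict:
    """Split a sentence into origin, type and data fields, and verify
    the trailing 2-digit hexadecimal checksum."""
    sentence = sentence.strip()
    if not sentence.startswith("$") or "*" not in sentence:
        raise ValueError("not an NMEA sentence")
    body, _, declared = sentence[1:].rpartition("*")
    prefix, *fields = body.split(",")
    return {
        "origin": prefix[:2],   # first 2 characters: originating device
        "type": prefix[2:],     # other 3 characters: type of sentence
        "fields": fields,
        "checksum_ok": f"{nmea_checksum(body):02X}" == declared.upper(),
    }


# Made-up date-and-time sentence from a radar, for illustration only.
body = "RAZDA,194000,13,01,2012,-01,00"
sentence = f"${body}*{nmea_checksum(body):02X}"
print(parse_nmea(sentence))
```

Sentences whose checksum does not match should be kept but flagged, since a corrupted record may still be relevant to the investigation.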
NMEA sentences are standard, but vendors are allowed to add
custom ones for specific purposes.
Timestamp: Unix time
= 4F 10 88 90 (hex)
= 1’326’483’600 (dec)
= Jan 13, 2012 @ 19:40:00 UTC
= Jan 13, 2012 @ 20:40:00 local time (UTC+1)
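The conversion above can be reproduced in Python. The sketch assumes the four bytes are read in the order shown (big-endian) and that local time is UTC+1; the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def decode_unix_timestamp(hex_bytes: str, utc_offset_hours: int = 0) -> datetime:
    """Interpret a hex string as a big-endian Unix timestamp and return
    it as a datetime in the requested fixed UTC offset."""
    seconds = int.from_bytes(bytes.fromhex(hex_bytes), "big")
    zone = timezone(timedelta(hours=utc_offset_hours))
    return datetime.fromtimestamp(seconds, tz=zone)


print(int("4F108890", 16))                                    # 1326483600
print(decode_unix_timestamp("4F 10 88 90").isoformat())       # 2012-01-13T19:40:00+00:00
print(decode_unix_timestamp("4F 10 88 90", 1).isoformat())    # 2012-01-13T20:40:00+01:00
```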
Example of standard sentence:
RA: origin (radar)
ZDA: date and time
-01: difference between local time and UTC
Example of non-standard sentence:
P: non-standard prefix
S: vendor (Seanet)
WTD: watertight doors
07: door number
C-----: door status (closed, no warnings)
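The two door fields can be decoded along these lines. This is only a sketch: it assumes the first status character is «C» (closed) or «O» (open) and that every following character other than «-» marks a warning, whose individual meaning is not given here. The function name is hypothetical.

```python
def decode_door_status(door: str, status: str) -> dict:
    """Decode the door number and status fields of a watertight-door
    sentence into a small dictionary.

    Assumption: status[0] is the door state, the remaining characters
    are warning flags, with '-' meaning "no warning".
    """
    states = {"C": "closed", "O": "open"}
    warnings = [
        position
        for position, flag in enumerate(status[1:], start=1)
        if flag != "-"
    ]
    return {
        "door": int(door),
        "state": states.get(status[:1], "unknown"),
        "warnings": warnings,  # positions of the raised flags
    }


print(decode_door_status("07", "C-----"))
# {'door': 7, 'state': 'closed', 'warnings': []}
```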
Once we were able to recover the raw data, we
proceeded to work on it to:
Understand the meaning of the standard and
Understand the relative importance of each
Build tools to parse the data and report the
results in a useful format.
Position of the
Evolution of the
Why does the
last signal we
have for door 8
read ‘O’ (open)?
the impact by
The steps we described are related to this specific
VDR model, but they also show a general approach
which could probably be applied, with further
studies, to any other model and vendor.
The analysis of the VDR data is of course easy to
perform with the vendor's closed, proprietary software,
but we were the first to publish a
forensically sound approach.