Semantically-enabled Digital
Investigations
by Spyridon Dosis
Outline
• Problem
• Background
• Developed Method
• Demonstration
• Conclusions
2015-05-17 ISACA Dagen 2013
Problem Area
• Complex attacks against
networked systems
• Multiple data sources of possible
evidentiary value
– Volume & Variety
– “looking for a needle in a stack of
needles” – Paul Pillar, CIA CoA
• Analysis of the collected digital
data
– Least formalized process step
– Rely on investigators’ expertise and
experience
Digital Evidence / Investigations
• Reliable digital data that support
hypothesizing about a security
incident
• Sound methods for collecting and
interpreting digital data
• Reconstruct events found to be
criminal (DF)
• Investigate and learn from
information security breaches (IR)
Forensic Tools
• Interpreters between data
abstraction layers
– e.g. Reconstruct raw disk data into
filesystem hierarchy and objects (files,
directories)
• Evidence-centric, but not investigation-centric,
design
• Limited tool interoperability
– Manual integration of tool findings
– Multiple (proprietary, undocumented)
data formats/models
A Digital Investigation Example
Semantic Web & Linked Data
Technologies
• “… information is given well-defined
meaning, better enabling computers
and people to work in cooperation” –
Tim Berners-Lee, 2001
• Ontology – “explicit and formal
specification of a conceptualization”
– Entities, attributes, relationships
• Metadata – context-based or domain-specific
annotation of data
• Reasoning and inference of implicit facts
Semantic Web Architecture
• URI/IRI enables global data object
identification
• XML provides a machine readable,
validatable data encoding scheme
• RDF(S) is a metadata data model and
knowledge representation language
– Subject-Property-Object/Value statements
– Class and Property hierarchies
• OWL 2 is a more expressive KR
language for specifying ontologies
– Restrictions, Equivalence, Cardinality,
Property Chains
• Rule and RDF-query languages
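
As a rough illustration of subject-property-object statements and class hierarchies, the sketch below models triples as plain Python tuples and derives the types implied by a subclass hierarchy. The `ex:` names and the path are hypothetical and are not taken from the ontologies in the talk; a real system would use an RDF library and a reasoner instead.

```python
# Triples are (subject, property, object) tuples; all ex: names are hypothetical.
TRIPLES = {
    ("ex:file42", "rdf:type", "ex:ExecutableFile"),
    ("ex:file42", "ex:hasPathName", "C:/Users/bob/setup.exe"),
    ("ex:ExecutableFile", "rdfs:subClassOf", "ex:File"),
    ("ex:File", "rdfs:subClassOf", "ex:DigitalObject"),
}


def superclasses(cls, triples):
    """Return every class reachable from cls via rdfs:subClassOf."""
    found, stack = set(), [cls]
    while stack:
        current = stack.pop()
        for s, p, o in triples:
            if s == current and p == "rdfs:subClassOf" and o not in found:
                found.add(o)
                stack.append(o)
    return found


def infer_types(triples):
    """Add the rdf:type statements implied by the class hierarchy."""
    inferred = set(triples)
    for s, p, o in triples:
        if p == "rdf:type":
            for parent in superclasses(o, triples):
                inferred.add((s, "rdf:type", parent))
    return inferred


ALL_TRIPLES = infer_types(TRIPLES)
```

After inference, `ex:file42` is also typed as `ex:File` and `ex:DigitalObject`, although neither statement was asserted explicitly.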
Method Overview
• Data Collection
• Semantic Representation
• Ontological Reasoning
• Rule-based Reasoning
• Integrated Query
Domain Ontologies
• Introduced a set of lightweight domain-specific OWL
ontologies
– Storage Media
– Network Traffic
– Windows Firewall Log, WHOIS RIR DB
– Malicious Networks Reputation List
– Malware Detection
Evidence Representation (Graph)
Semantic Representation
• Resource Unique Identification Scheme
• Parsing tools able to process each source type with
respect to the domain ontology
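
The slide does not spell out the identification scheme, so the following is only one possible sketch: a stable URI is built from a case identifier, the source type, and a source-local key. The base namespace, the function name `mint_uri`, and the sample values are all assumptions made for illustration.

```python
from urllib.parse import quote

BASE = "http://example.org/evidence/"  # hypothetical namespace, not from the talk


def mint_uri(case_id, source_type, local_key):
    """Build a stable URI from the case, the source type and a source-local key."""
    parts = [case_id, source_type, local_key]
    # Percent-encode each part so that paths and other keys stay a single segment.
    return BASE + "/".join(quote(str(p), safe="") for p in parts)


uri = mint_uri("case-001", "disk", "C:/Users/bob/setup.exe")
```

Because the URI is derived only from its inputs, parsing the same source twice yields the same identifier, which keeps repeated imports from creating duplicate resources.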
Evidence Integration
• Automated linking among homogeneous and heterogeneous evidence
sources based on key properties & matching rules
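
Linking on a key property can be sketched as follows, assuming the matching rule is plain equality of MD5 digests between files recovered from disk and HTTP bodies carved from a packet capture. The record layout and identifiers are hypothetical; only the property name `integration:HTTPContentToMediaFile` is borrowed from the sample query later in the deck.

```python
import hashlib

# Hypothetical records produced by two different parsers.
disk_files = [
    {"id": "disk:file1", "md5": hashlib.md5(b"payload").hexdigest()},
    {"id": "disk:file2", "md5": hashlib.md5(b"notes").hexdigest()},
]
http_bodies = [
    {"id": "pcap:body7", "md5": hashlib.md5(b"payload").hexdigest()},
]


def link_on_key(left, right, key, prop):
    """Emit (right_id, prop, left_id) for every pair sharing the same key value."""
    index = {}
    for item in left:
        index.setdefault(item[key], []).append(item["id"])
    links = []
    for item in right:
        for left_id in index.get(item[key], []):
            links.append((item["id"], prop, left_id))
    return links


links = link_on_key(
    disk_files, http_bodies, "md5", "integration:HTTPContentToMediaFile"
)
```

Indexing one side by the key avoids comparing every pair of records, which matters once the sources grow large.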
Evidence Correlation
• Link instances of dissimilar
type across a shared
domain
• Temporal Correlation
– Rules for establishing time
instant & interval relations
among recovered artifacts
• Mereological Correlation
– “partOf” transitivity relations
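
The temporal rules are not listed on the slide; the sketch below shows one plausible shape for them, with coarse relations between instants and between intervals plus a "shortly after" check. The relation names, the five-minute window, and the timestamps are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta


def instant_relation(a, b):
    """Relate two time instants: 'before', 'after' or 'equals'."""
    if a < b:
        return "before"
    if a > b:
        return "after"
    return "equals"


def interval_relation(a, b):
    """Relate two (start, end) intervals using a few coarse relations.

    Identical intervals are reported as 'during' in this simplified version.
    """
    a_start, a_end = a
    b_start, b_end = b
    if a_end < b_start:
        return "before"
    if b_end < a_start:
        return "after"
    if b_start <= a_start and a_end <= b_end:
        return "during"
    if a_start <= b_start and b_end <= a_end:
        return "contains"
    return "overlaps"


def shortly_after(event, reference, window=timedelta(minutes=5)):
    """True when event falls within `window` after reference."""
    return timedelta(0) <= event - reference <= window


# Hypothetical artifact timestamps.
download = datetime(2013, 5, 1, 10, 0, 0)
file_created = datetime(2013, 5, 1, 10, 0, 42)
tcp_flow = (datetime(2013, 5, 1, 9, 59, 50), datetime(2013, 5, 1, 10, 0, 30))
http_exchange = (datetime(2013, 5, 1, 10, 0, 0), datetime(2013, 5, 1, 10, 0, 20))
```

With these values the download precedes the file creation, the HTTP exchange falls within the TCP flow, and the file creation counts as shortly after the download.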
Semantic Integration & Correlation
Integrated Query
• A purpose-built triplestore (graph database) engine
stores the final dataset
– Up to billions of triples
• SQL-like (SPARQL) queries against the integrated/correlated
evidence set
• Graph pattern matching techniques
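
Graph pattern matching can be illustrated with a naive matcher over an in-memory list of triples, where terms starting with `?` are variables. This is only a sketch of the idea, not how a triplestore evaluates queries; the property names come from the sample query in this deck, while the `ex:` instances are hypothetical.

```python
def match_pattern(pattern, triple, binding):
    """Extend `binding` so that `pattern` matches `triple`; return None on failure."""
    new_binding = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep the value it was bound to earlier.
            if new_binding.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new_binding


def query(patterns, triples):
    """Return all variable bindings that satisfy every pattern."""
    bindings = [{}]
    for pattern in patterns:
        next_bindings = []
        for binding in bindings:
            for triple in triples:
                extended = match_pattern(pattern, triple, binding)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return bindings


TRIPLES = [
    ("ex:file1", "rdf:type", "digitalmedia:File"),
    ("ex:file1", "digitalmedia:hasPathName", "C:/Temp/setup.exe"),
    ("ex:file2", "rdf:type", "digitalmedia:File"),
    ("ex:file2", "digitalmedia:hasPathName", "C:/Temp/notes.txt"),
    ("ex:body7", "integration:HTTPContentToMediaFile", "ex:file1"),
]

results = query(
    [
        ("?file", "rdf:type", "digitalmedia:File"),
        ("?file", "digitalmedia:hasPathName", "?pathName"),
        ("?httpbody", "integration:HTTPContentToMediaFile", "?file"),
    ],
    TRIPLES,
)
```

Only `ex:file1` satisfies all three patterns, because it is the only file linked to an HTTP body.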
A PoC Instantiation
• Evidence Manager
• Filtering / Pre-processing
• Semantic Parser
• Inference Engine
• Classification, Inverse &
Transitive Properties
• Rule & Query Engines
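
The inverse and transitive property inferences named above can be sketched with two small functions over a set of triples. This is a simplified stand-in for what an OWL reasoner does; `ex:partOf` mirrors the "partOf" relation mentioned earlier, while `ex:hasPart` and the instance names are hypothetical.

```python
def apply_transitive(triples, prop):
    """Add (a, prop, c) whenever (a, prop, b) and (b, prop, c) hold, until stable."""
    closed = set(triples)
    changed = True
    while changed:
        changed = False
        edges = [(s, o) for s, p, o in closed if p == prop]
        for a, b in edges:
            for b2, c in edges:
                if b == b2 and (a, prop, c) not in closed:
                    closed.add((a, prop, c))
                    changed = True
    return closed


def apply_inverse(triples, prop, inverse_prop):
    """For every (s, prop, o) add the inverse statement (o, inverse_prop, s)."""
    return set(triples) | {(o, inverse_prop, s) for s, p, o in triples if p == prop}


ASSERTED = {
    ("ex:file1", "ex:partOf", "ex:dirTemp"),
    ("ex:dirTemp", "ex:partOf", "ex:volumeC"),
    ("ex:volumeC", "ex:partOf", "ex:disk0"),
}

INFERRED = apply_inverse(
    apply_transitive(ASSERTED, "ex:partOf"), "ex:partOf", "ex:hasPart"
)
```

From three asserted statements the sketch derives that `ex:file1` is part of `ex:disk0` and, through the inverse property, that `ex:disk0` has `ex:file1` as a part.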
Experiment A
Experiment B
Sample Query
• “Is any file resident on the disk malicious and if yes where
has it been downloaded from and which ISP did the IP
belong to?”
Sample Query
SELECT DISTINCT ?pathName ?uri ?ipvalue ?asnumber ?link
WHERE {
  ?file rdf:type digitalmedia:File .
  ?file digitalmedia:hasPathName ?pathName .
  ?file digitalmedia:hasMD5 ?md5 .
  ?httpbody integration:HTTPContentToMediaFile ?file .
  ?file integration:MediaFileToVTFile ?vtfile .
  ?vtfile virustotal:hasAVReport ?report .
  ?report virustotal:hasPermanentLink ?link .
  ?httpresp http:body ?httpbody .
  ?httpreq http:requestURI ?uri .
  ?httpreq http:resp ?httpresp .
  ?http packetcapture:hasHTTPRequest ?httpreq .
  ?http rdf:type packetcapture:HTTP .
  ?tcpflow packetcapture:hasApplicationLayerProtocol ?http .
  ?tcpflow packetcapture:hasDestinationIP ?destip .
  ?destip packetcapture:hasIPValue ?ipvalue .
  ?destip integration:PcapIPToWHOISIpAddr ?whoisip .
  ?whoisip whois:isContainedInRange ?range .
  ?range whois:hasRange ?rangeValue .
  ?range whois:isContainedInAS ?as .
  ?as whois:hasNetName ?netname .
  ?as whois:hasASNumber ?asnumber .
}
Example Hypotheses / Queries
• Have there been any unsuccessful connection attempts
from systems in the same network as the one that hosted
the malicious file?
• Which disk files have been created or accessed shortly after
the malicious file was downloaded?
• Has there been any successful connection between our
system and a known malicious host?
• Which files have been accessed shortly before the host
communicated with any blacklisted network host?
• Which websites have been visited by the user shortly
before the download of the malicious file?
Summary
• Ability to represent and integrate heterogeneous data
• Supports the formulation and execution of complex queries
• Expandable (ontologies, rules, queries)
• Computational complexity depends on the ontology, rules,
amount of data
• Reliance on online data sources may affect the accuracy of
the results
Future Work
• Advanced reasoning capabilities (e.g. detect
anti-forensic inconsistencies)
• Extended analysis techniques (e.g. additional
data sources, user activities)
• Large scale performance evaluation, distributed
architecture
• User-friendly graphical interface for rule/query
formulation and result navigation
Thank you
2015-05-17 ISACA Dagen 2013

More Related Content

What's hot

Dataset Descriptions in Open PHACTS and HCLS
Dataset Descriptions in Open PHACTS and HCLSDataset Descriptions in Open PHACTS and HCLS
Dataset Descriptions in Open PHACTS and HCLS
Alasdair Gray
 
Linked Open Data and DANS
Linked Open Data and DANSLinked Open Data and DANS
Linked Open Data and DANS
vty
 
DataverseNL as structured data hub
DataverseNL as structured data hubDataverseNL as structured data hub
DataverseNL as structured data hub
vty
 
Fire kit ios (r-baldwin)
Fire kit ios (r-baldwin)Fire kit ios (r-baldwin)
Fire kit ios (r-baldwin)
DevDays
 
The Rhizomer Semantic Content Management System
The Rhizomer Semantic Content Management SystemThe Rhizomer Semantic Content Management System
The Rhizomer Semantic Content Management System
Roberto García
 
Distributed Tracing at UBER Scale: Creating a treasure map for your monitori...
Distributed Tracing at UBER Scale: Creating a treasure map for your monitori...Distributed Tracing at UBER Scale: Creating a treasure map for your monitori...
Distributed Tracing at UBER Scale: Creating a treasure map for your monitori...
Yuri Shkuro
 
The Reality of Digital Transfer @ArchivesNZ
The Reality of Digital Transfer @ArchivesNZThe Reality of Digital Transfer @ArchivesNZ
The Reality of Digital Transfer @ArchivesNZ
Ross Spencer
 
Extending DSpace 7: DSpace-CRIS and DSpace-GLAM for empowered repositories an...
Extending DSpace 7: DSpace-CRIS and DSpace-GLAM for empowered repositories an...Extending DSpace 7: DSpace-CRIS and DSpace-GLAM for empowered repositories an...
Extending DSpace 7: DSpace-CRIS and DSpace-GLAM for empowered repositories an...
4Science
 
Introduction of semantic technology for SAS programmers
Introduction of semantic technology for SAS programmersIntroduction of semantic technology for SAS programmers
Introduction of semantic technology for SAS programmers
Kevin Lee
 
Psicquic tutorial
Psicquic tutorialPsicquic tutorial
Psicquic tutorial
Rafael C. Jimenez
 
DSpace-CRIS: new features and contribution to the DSpace mainstream
DSpace-CRIS: new features and contribution to the DSpace mainstreamDSpace-CRIS: new features and contribution to the DSpace mainstream
DSpace-CRIS: new features and contribution to the DSpace mainstream
Andrea Bollini
 
Binary Trees? Automatically identifying the links between born-digital records
Binary Trees? Automatically identifying the links between born-digital recordsBinary Trees? Automatically identifying the links between born-digital records
Binary Trees? Automatically identifying the links between born-digital records
Ross Spencer
 
ICIC 2013 Conference Proceedings Andreas Pesenhofer max.recall
ICIC 2013 Conference Proceedings Andreas Pesenhofer max.recallICIC 2013 Conference Proceedings Andreas Pesenhofer max.recall
ICIC 2013 Conference Proceedings Andreas Pesenhofer max.recall
Dr. Haxel Consult
 
Duraspace Hot Topics Series 6: Metadata and Repository Services
Duraspace Hot Topics Series 6: Metadata and Repository ServicesDuraspace Hot Topics Series 6: Metadata and Repository Services
Duraspace Hot Topics Series 6: Metadata and Repository Services
Matthew Critchlow
 
Checksum 101
Checksum 101Checksum 101
Checksum 101
Ross Spencer
 
4Science presentes: ORCiD API Tutorial
4Science presentes: ORCiD API Tutorial4Science presentes: ORCiD API Tutorial
4Science presentes: ORCiD API Tutorial
4Science
 
New Product Introductions - FIZ Karlsruhe
New Product Introductions - FIZ KarlsruheNew Product Introductions - FIZ Karlsruhe
New Product Introductions - FIZ Karlsruhe
Dr. Haxel Consult
 
Whowas: History of resources at APNIC
Whowas: History of resources at APNICWhowas: History of resources at APNIC
Whowas: History of resources at APNIC
APNIC
 
Interoperability is the key: repositories networks promoting the quality and ...
Interoperability is the key: repositories networks promoting the quality and ...Interoperability is the key: repositories networks promoting the quality and ...
Interoperability is the key: repositories networks promoting the quality and ...
Pedro Príncipe
 

What's hot (19)

Dataset Descriptions in Open PHACTS and HCLS
Dataset Descriptions in Open PHACTS and HCLSDataset Descriptions in Open PHACTS and HCLS
Dataset Descriptions in Open PHACTS and HCLS
 
Linked Open Data and DANS
Linked Open Data and DANSLinked Open Data and DANS
Linked Open Data and DANS
 
DataverseNL as structured data hub
DataverseNL as structured data hubDataverseNL as structured data hub
DataverseNL as structured data hub
 
Fire kit ios (r-baldwin)
Fire kit ios (r-baldwin)Fire kit ios (r-baldwin)
Fire kit ios (r-baldwin)
 
The Rhizomer Semantic Content Management System
The Rhizomer Semantic Content Management SystemThe Rhizomer Semantic Content Management System
The Rhizomer Semantic Content Management System
 
Distributed Tracing at UBER Scale: Creating a treasure map for your monitori...
Distributed Tracing at UBER Scale: Creating a treasure map for your monitori...Distributed Tracing at UBER Scale: Creating a treasure map for your monitori...
Distributed Tracing at UBER Scale: Creating a treasure map for your monitori...
 
The Reality of Digital Transfer @ArchivesNZ
The Reality of Digital Transfer @ArchivesNZThe Reality of Digital Transfer @ArchivesNZ
The Reality of Digital Transfer @ArchivesNZ
 
Extending DSpace 7: DSpace-CRIS and DSpace-GLAM for empowered repositories an...
Extending DSpace 7: DSpace-CRIS and DSpace-GLAM for empowered repositories an...Extending DSpace 7: DSpace-CRIS and DSpace-GLAM for empowered repositories an...
Extending DSpace 7: DSpace-CRIS and DSpace-GLAM for empowered repositories an...
 
Introduction of semantic technology for SAS programmers
Introduction of semantic technology for SAS programmersIntroduction of semantic technology for SAS programmers
Introduction of semantic technology for SAS programmers
 
Psicquic tutorial
Psicquic tutorialPsicquic tutorial
Psicquic tutorial
 
DSpace-CRIS: new features and contribution to the DSpace mainstream
DSpace-CRIS: new features and contribution to the DSpace mainstreamDSpace-CRIS: new features and contribution to the DSpace mainstream
DSpace-CRIS: new features and contribution to the DSpace mainstream
 
Binary Trees? Automatically identifying the links between born-digital records
Binary Trees? Automatically identifying the links between born-digital recordsBinary Trees? Automatically identifying the links between born-digital records
Binary Trees? Automatically identifying the links between born-digital records
 
ICIC 2013 Conference Proceedings Andreas Pesenhofer max.recall
ICIC 2013 Conference Proceedings Andreas Pesenhofer max.recallICIC 2013 Conference Proceedings Andreas Pesenhofer max.recall
ICIC 2013 Conference Proceedings Andreas Pesenhofer max.recall
 
Duraspace Hot Topics Series 6: Metadata and Repository Services
Duraspace Hot Topics Series 6: Metadata and Repository ServicesDuraspace Hot Topics Series 6: Metadata and Repository Services
Duraspace Hot Topics Series 6: Metadata and Repository Services
 
Checksum 101
Checksum 101Checksum 101
Checksum 101
 
4Science presentes: ORCiD API Tutorial
4Science presentes: ORCiD API Tutorial4Science presentes: ORCiD API Tutorial
4Science presentes: ORCiD API Tutorial
 
New Product Introductions - FIZ Karlsruhe
New Product Introductions - FIZ KarlsruheNew Product Introductions - FIZ Karlsruhe
New Product Introductions - FIZ Karlsruhe
 
Whowas: History of resources at APNIC
Whowas: History of resources at APNICWhowas: History of resources at APNIC
Whowas: History of resources at APNIC
 
Interoperability is the key: repositories networks promoting the quality and ...
Interoperability is the key: repositories networks promoting the quality and ...Interoperability is the key: repositories networks promoting the quality and ...
Interoperability is the key: repositories networks promoting the quality and ...
 

Viewers also liked

Ppt novel gue
Ppt novel guePpt novel gue
Ppt novel gue
lailyfary
 
Overcoming Continuous Delivery Impedance
Overcoming Continuous Delivery ImpedanceOvercoming Continuous Delivery Impedance
Overcoming Continuous Delivery Impedance
Mark Rendell
 
Continuous Delivery with a PaaS Application
Continuous Delivery with a PaaS ApplicationContinuous Delivery with a PaaS Application
Continuous Delivery with a PaaS Application
Mark Rendell
 
Diktat autocad
Diktat autocadDiktat autocad
Diktat autocad
opungganteng
 
New microsoft office word document
New microsoft office word documentNew microsoft office word document
New microsoft office word document
budiristanto
 
Photon Network Course 2014
Photon Network Course 2014Photon Network Course 2014
Photon Network Course 2014
Alexander Sosnovskiy
 
Semantically-Enabled Digital Investigations - Research Overview
Semantically-Enabled Digital Investigations - Research OverviewSemantically-Enabled Digital Investigations - Research Overview
Semantically-Enabled Digital Investigations - Research Overview
inbroker
 
Neutron behind the scenes
Neutron   behind the scenesNeutron   behind the scenes
Neutron behind the scenes
inbroker
 
Ignite: When You Need A DevOps Team
Ignite: When You Need A DevOps TeamIgnite: When You Need A DevOps Team
Ignite: When You Need A DevOps Team
Mark Rendell
 
The dumb waiter
The dumb waiterThe dumb waiter
The dumb waiter
lailyfary
 
Introducing Lubbock Entertainment and Performing Arts Center
Introducing Lubbock Entertainment and Performing Arts CenterIntroducing Lubbock Entertainment and Performing Arts Center
Introducing Lubbock Entertainment and Performing Arts Center
LEPAA
 
presentation on transfer of training
presentation on transfer of trainingpresentation on transfer of training
presentation on transfer of trainingpallavi313
 
Breaking the 2 Pizza Paradox with your Platform as an Application
Breaking the 2 Pizza Paradox with your Platform as an ApplicationBreaking the 2 Pizza Paradox with your Platform as an Application
Breaking the 2 Pizza Paradox with your Platform as an Application
Mark Rendell
 
The Network Protocol Stack Revisited
The Network Protocol Stack RevisitedThe Network Protocol Stack Revisited
The Network Protocol Stack Revisited
inbroker
 
DNS Security
DNS SecurityDNS Security
DNS Security
inbroker
 
Network tunneling techniques
Network tunneling techniquesNetwork tunneling techniques
Network tunneling techniques
inbroker
 

Viewers also liked (16)

Ppt novel gue
Ppt novel guePpt novel gue
Ppt novel gue
 
Overcoming Continuous Delivery Impedance
Overcoming Continuous Delivery ImpedanceOvercoming Continuous Delivery Impedance
Overcoming Continuous Delivery Impedance
 
Continuous Delivery with a PaaS Application
Continuous Delivery with a PaaS ApplicationContinuous Delivery with a PaaS Application
Continuous Delivery with a PaaS Application
 
Diktat autocad
Diktat autocadDiktat autocad
Diktat autocad
 
New microsoft office word document
New microsoft office word documentNew microsoft office word document
New microsoft office word document
 
Photon Network Course 2014
Photon Network Course 2014Photon Network Course 2014
Photon Network Course 2014
 
Semantically-Enabled Digital Investigations - Research Overview
Semantically-Enabled Digital Investigations - Research OverviewSemantically-Enabled Digital Investigations - Research Overview
Semantically-Enabled Digital Investigations - Research Overview
 
Neutron behind the scenes
Neutron   behind the scenesNeutron   behind the scenes
Neutron behind the scenes
 
Ignite: When You Need A DevOps Team
Ignite: When You Need A DevOps TeamIgnite: When You Need A DevOps Team
Ignite: When You Need A DevOps Team
 
The dumb waiter
The dumb waiterThe dumb waiter
The dumb waiter
 
Introducing Lubbock Entertainment and Performing Arts Center
Introducing Lubbock Entertainment and Performing Arts CenterIntroducing Lubbock Entertainment and Performing Arts Center
Introducing Lubbock Entertainment and Performing Arts Center
 
presentation on transfer of training
presentation on transfer of trainingpresentation on transfer of training
presentation on transfer of training
 
Breaking the 2 Pizza Paradox with your Platform as an Application
Breaking the 2 Pizza Paradox with your Platform as an ApplicationBreaking the 2 Pizza Paradox with your Platform as an Application
Breaking the 2 Pizza Paradox with your Platform as an Application
 
The Network Protocol Stack Revisited
The Network Protocol Stack RevisitedThe Network Protocol Stack Revisited
The Network Protocol Stack Revisited
 
DNS Security
DNS SecurityDNS Security
DNS Security
 
Network tunneling techniques
Network tunneling techniquesNetwork tunneling techniques
Network tunneling techniques
 

Similar to Semantically-Enabled Digital Investigations

FAIR Workflows and Research Objects get a Workout
FAIR Workflows and Research Objects get a Workout FAIR Workflows and Research Objects get a Workout
FAIR Workflows and Research Objects get a Workout
Carole Goble
 
State of Florida Neo4j Graph Briefing - Cyber IAM
State of Florida Neo4j Graph Briefing - Cyber IAMState of Florida Neo4j Graph Briefing - Cyber IAM
State of Florida Neo4j Graph Briefing - Cyber IAM
Neo4j
 
Pushing Chemical Biology Through the Pipes
Pushing Chemical Biology Through the PipesPushing Chemical Biology Through the Pipes
Pushing Chemical Biology Through the PipesRajarshi Guha
 
Scaling up Linked Data
Scaling up Linked DataScaling up Linked Data
Scaling up Linked Data
Marin Dimitrov
 
Linked services: Connecting services to the Web of Data
Linked services: Connecting services to the Web of DataLinked services: Connecting services to the Web of Data
Linked services: Connecting services to the Web of Data
John Domingue
 
Lawless-3-jun15
Lawless-3-jun15Lawless-3-jun15
Telco analytics at scale
Telco analytics at scaleTelco analytics at scale
Telco analytics at scale
datamantra
 
CLARIAH Toogdag 2018: A distributed network of digital heritage information
CLARIAH Toogdag 2018: A distributed network of digital heritage informationCLARIAH Toogdag 2018: A distributed network of digital heritage information
CLARIAH Toogdag 2018: A distributed network of digital heritage information
Enno Meijers
 
Data governance datalakes_multitenancy
Data governance datalakes_multitenancyData governance datalakes_multitenancy
Data governance datalakes_multitenancy
Sathish K S
 
2015 05-07-mac
2015 05-07-mac2015 05-07-mac
Crossing Analytics Systems: Case for Integrated Provenance in Data Lakes
Crossing Analytics Systems: Case for Integrated Provenance in Data LakesCrossing Analytics Systems: Case for Integrated Provenance in Data Lakes
Crossing Analytics Systems: Case for Integrated Provenance in Data Lakes
Isuru Suriarachchi
 
Southwickc lampert lodlam_training
Southwickc lampert lodlam_trainingSouthwickc lampert lodlam_training
Southwickc lampert lodlam_training
ssouthwick
 
FAIRDOM data management support for ERACoBioTech Proposals
FAIRDOM data management support for ERACoBioTech ProposalsFAIRDOM data management support for ERACoBioTech Proposals
FAIRDOM data management support for ERACoBioTech Proposals
FAIRDOM
 
WOTS2E: A Search Engine for a Semantic Web of Things
WOTS2E: A Search Engine for a Semantic Web of ThingsWOTS2E: A Search Engine for a Semantic Web of Things
WOTS2E: A Search Engine for a Semantic Web of Things
Andreas Kamilaris
 
Hypermedia for Machine APIs
Hypermedia for Machine APIsHypermedia for Machine APIs
Hypermedia for Machine APIs
Michael Koster
 
Caplan and York, 'What It Takes To Make It Last: E-Resources Preservation"
Caplan and York, 'What It Takes To Make It Last:  E-Resources Preservation"Caplan and York, 'What It Takes To Make It Last:  E-Resources Preservation"
Caplan and York, 'What It Takes To Make It Last: E-Resources Preservation"
National Information Standards Organization (NISO)
 
How e-infrastructure can contribute to Linked Germplasm Data
How e-infrastructure can contribute to Linked Germplasm DataHow e-infrastructure can contribute to Linked Germplasm Data
How e-infrastructure can contribute to Linked Germplasm DataStoitsis Giannis
 
A collaborative approach to "filling the digital preservation gap" for Resear...
A collaborative approach to "filling the digital preservation gap" for Resear...A collaborative approach to "filling the digital preservation gap" for Resear...
A collaborative approach to "filling the digital preservation gap" for Resear...
Jenny Mitcham
 

Similar to Semantically-Enabled Digital Investigations (20)

FAIR Workflows and Research Objects get a Workout
FAIR Workflows and Research Objects get a Workout FAIR Workflows and Research Objects get a Workout
FAIR Workflows and Research Objects get a Workout
 
State of Florida Neo4j Graph Briefing - Cyber IAM
State of Florida Neo4j Graph Briefing - Cyber IAMState of Florida Neo4j Graph Briefing - Cyber IAM
State of Florida Neo4j Graph Briefing - Cyber IAM
 
Pushing Chemical Biology Through the Pipes
Pushing Chemical Biology Through the PipesPushing Chemical Biology Through the Pipes
Pushing Chemical Biology Through the Pipes
 
Presentation 16 may keynote karin bredenberg
Presentation 16 may keynote karin bredenbergPresentation 16 may keynote karin bredenberg
Presentation 16 may keynote karin bredenberg
 
Scaling up Linked Data
Scaling up Linked DataScaling up Linked Data
Scaling up Linked Data
 
Design patternsforiot
Design patternsforiotDesign patternsforiot
Design patternsforiot
 
Linked services: Connecting services to the Web of Data
Linked services: Connecting services to the Web of DataLinked services: Connecting services to the Web of Data
Linked services: Connecting services to the Web of Data
 
Lawless-3-jun15
Lawless-3-jun15Lawless-3-jun15
Lawless-3-jun15
 
Telco analytics at scale
Telco analytics at scaleTelco analytics at scale
Telco analytics at scale
 
CLARIAH Toogdag 2018: A distributed network of digital heritage information
CLARIAH Toogdag 2018: A distributed network of digital heritage informationCLARIAH Toogdag 2018: A distributed network of digital heritage information
CLARIAH Toogdag 2018: A distributed network of digital heritage information
 
Data governance datalakes_multitenancy
Data governance datalakes_multitenancyData governance datalakes_multitenancy
Data governance datalakes_multitenancy
 
2015 05-07-mac
2015 05-07-mac2015 05-07-mac
2015 05-07-mac
 
Crossing Analytics Systems: Case for Integrated Provenance in Data Lakes
Crossing Analytics Systems: Case for Integrated Provenance in Data LakesCrossing Analytics Systems: Case for Integrated Provenance in Data Lakes
Crossing Analytics Systems: Case for Integrated Provenance in Data Lakes
 
Southwickc lampert lodlam_training
Southwickc lampert lodlam_trainingSouthwickc lampert lodlam_training
Southwickc lampert lodlam_training
 
FAIRDOM data management support for ERACoBioTech Proposals
FAIRDOM data management support for ERACoBioTech ProposalsFAIRDOM data management support for ERACoBioTech Proposals
FAIRDOM data management support for ERACoBioTech Proposals
 
WOTS2E: A Search Engine for a Semantic Web of Things
WOTS2E: A Search Engine for a Semantic Web of ThingsWOTS2E: A Search Engine for a Semantic Web of Things
WOTS2E: A Search Engine for a Semantic Web of Things
 
Hypermedia for Machine APIs
Hypermedia for Machine APIsHypermedia for Machine APIs
Hypermedia for Machine APIs
 
Caplan and York, 'What It Takes To Make It Last: E-Resources Preservation"
Caplan and York, 'What It Takes To Make It Last:  E-Resources Preservation"Caplan and York, 'What It Takes To Make It Last:  E-Resources Preservation"
Caplan and York, 'What It Takes To Make It Last: E-Resources Preservation"
 
How e-infrastructure can contribute to Linked Germplasm Data
How e-infrastructure can contribute to Linked Germplasm DataHow e-infrastructure can contribute to Linked Germplasm Data
How e-infrastructure can contribute to Linked Germplasm Data
 
A collaborative approach to "filling the digital preservation gap" for Resear...
A collaborative approach to "filling the digital preservation gap" for Resear...A collaborative approach to "filling the digital preservation gap" for Resear...
A collaborative approach to "filling the digital preservation gap" for Resear...
 

Recently uploaded

GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
Neo4j
 
Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1
DianaGray10
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
KatiaHIMEUR1
 
Microsoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdfMicrosoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdf
Uni Systems S.M.S.A.
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
 
GridMate - End to end testing is a critical piece to ensure quality and avoid...
GridMate - End to end testing is a critical piece to ensure quality and avoid...GridMate - End to end testing is a critical piece to ensure quality and avoid...
GridMate - End to end testing is a critical piece to ensure quality and avoid...
ThomasParaiso2
 
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptxSecstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
nkrafacyberclub
 
UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5
DianaGray10
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
Adtran
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
Alan Dix
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance
 
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Nexer Digital
 
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Aggregage
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
Jemma Hussein Allen
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
BookNet Canada
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
DianaGray10
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
Aftab Hussain
 

Semantically-Enabled Digital Investigations

  • 2. Outline
      • Problem
      • Background
      • Developed Method
      • Demonstration
      • Conclusions
      2015-05-17 ISACA Dagen 2013
  • 3. Problem Area
      • Complex attacks against networked systems
      • Multiple data sources of possible evidentiary value
          – Volume & Variety
          – “looking for a needle in a stack of needles” – Paul Pillar, CIA CoA
      • Analysis of the collected digital data
          – Least formalized process step
          – Rely on investigators’ expertise and experience
  • 4. Digital Evidence / Investigations
      • Reliable digital data that support hypothesizing about a security incident
      • Sound methods for collecting and interpreting digital data
      • Reconstruct events found to be criminal (DF)
      • Investigate and learn from information security breaches (IR)
  • 5. Forensic Tools
      • Interpreters between data abstraction layers
          – e.g. Reconstruct raw disk data into filesystem hierarchy and objects (files, directories)
      • Evidence- but not investigation-centric design
      • Limited tool interoperability
          – Manual integration of tool findings
          – Multiple (proprietary, undocumented) data formats/models
  • 6. A Digital Investigation Example
  • 7. Semantic Web & Linked Data Technologies
      • “… information is given well-defined meaning, better enabling computers and people to work in cooperation” – (Tim Berners-Lee, 2001)
      • Ontology – “explicit and formal specification of a conceptualization”
          – Entities, attributes, relationships
      • Metadata – Context-based or domain-specific annotation of data
      • Reasoning and inference of implicit facts
  • 8. Semantic Web Architecture
      • URI/IRI enables global data object identification
      • XML provides a machine-readable, validatable data encoding scheme
      • RDF(S) is a metadata data model and knowledge representation language
          – Subject-Property-Object/Value statements
          – Class and Property hierarchies
      • OWL 2 is a more expressive KR language for specifying ontologies
          – Restrictions, Equivalence, Cardinality, Property Chains
      • Rule and RDF-query languages
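The subject-property-object statements and class hierarchies on slide 8 can be illustrated with a small, dependency-free sketch. It is not the tooling used in the talk: the `ex:` names are hypothetical, and the hand-written `superclasses` function only approximates what an RDFS reasoner would derive from `rdfs:subClassOf`.

```python
# Hypothetical vocabulary: the "ex:" names are illustrative, not the talk's ontology.
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"

# Each statement is a (subject, property, object) tuple.
triples = {
    ("ex:File", SUBCLASS_OF, "ex:DigitalObject"),
    ("ex:Executable", SUBCLASS_OF, "ex:File"),
    ("ex:file42", RDF_TYPE, "ex:Executable"),
    ("ex:file42", "ex:hasPathName", "/tmp/setup.exe"),
}


def superclasses(cls, graph):
    """Return every class reachable from cls via rdfs:subClassOf."""
    found = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for s, p, o in graph:
            if s == current and p == SUBCLASS_OF and o not in found:
                found.add(o)
                stack.append(o)
    return found


def inferred_types(resource, graph):
    """Asserted types plus the types implied by the class hierarchy."""
    asserted = {o for s, p, o in graph if s == resource and p == RDF_TYPE}
    implied = set()
    for cls in asserted:
        implied |= superclasses(cls, graph)
    return asserted | implied


types_of_file42 = inferred_types("ex:file42", triples)
print(sorted(types_of_file42))
```

Only the `ex:Executable` type is asserted for `ex:file42`; the other two types are implicit facts recovered by walking the hierarchy.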
  • 9. Method Overview
      • Data Collection
      • Semantic Representation
      • Ontological Reasoning
      • Rule-based Reasoning
      • Integrated Query
  • 10. Domain Ontologies
      • Introduced a set of lightweight domain-specific OWL ontologies
          – Storage Media
          – Network Traffic
          – Windows Firewall Log, WHOIS RIR DB
          – Malicious Networks Reputation List
          – Malware Detection
  • 12. Semantic Representation
      • Resource Unique Identification Scheme
      • Parsing tools able to process each source type with respect to the domain ontology
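One possible shape for a resource identification scheme and a source-specific parser is sketched below. The slides do not describe the actual scheme, so everything here is an assumption: the `http://example.org/evidence/` namespace, the `mint_uri` and `file_record_to_triples` helpers, and the choice to hash the path. Only the `digitalmedia:` property names are taken from the sample query on slide 21.

```python
import hashlib
from urllib.parse import quote

BASE = "http://example.org/evidence/"  # hypothetical namespace, not from the talk


def mint_uri(source_id, resource_type, natural_key):
    """Build a stable identifier from the evidence source, type, and natural key."""
    digest = hashlib.sha256(natural_key.encode("utf-8")).hexdigest()[:16]
    return f"{BASE}{quote(source_id, safe='')}/{quote(resource_type, safe='')}/{digest}"


def file_record_to_triples(source_id, record):
    """Turn one parsed file record (a dict) into subject-property-object triples."""
    subject = mint_uri(source_id, "File", record["path"])
    return [
        (subject, "rdf:type", "digitalmedia:File"),
        (subject, "digitalmedia:hasPathName", record["path"]),
        (subject, "digitalmedia:hasMD5", record["md5"]),
    ]


# Placeholder record: the path and the all-zero hash are made up for illustration.
example_triples = file_record_to_triples(
    "disk-image-01", {"path": "/Users/a/setup.exe", "md5": "0" * 32}
)
for triple in example_triples:
    print(triple)
```

Deriving the identifier deterministically from the source and a natural key means that re-parsing the same evidence yields the same resource, which keeps later linking steps repeatable.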
  • 13. Evidence Integration
      • Automated linking among homogeneous and heterogeneous evidence sources based on key properties & matching rules
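A matching rule of this kind can be sketched as a join on a shared key value. The example assumes that file hashes are the key property; `virustotal:hasMD5` and the resource names are hypothetical, while `integration:MediaFileToVTFile` and `digitalmedia:hasMD5` appear in the sample query on slide 21.

```python
from collections import defaultdict


def link_by_key(graph, left_prop, right_prop, link_prop):
    """Emit (left, link_prop, right) for resources whose key values match exactly."""
    # Index the right-hand resources by their key value.
    right_index = defaultdict(set)
    for s, p, o in graph:
        if p == right_prop:
            right_index[o].add(s)
    # Join each left-hand resource against the index.
    links = set()
    for s, p, o in graph:
        if p == left_prop:
            for match in right_index.get(o, ()):
                links.add((s, link_prop, match))
    return links


# Placeholder hash values, not real digests.
graph = {
    ("disk:file1", "digitalmedia:hasMD5", "a" * 32),
    ("disk:file2", "digitalmedia:hasMD5", "b" * 32),
    ("vt:file9", "virustotal:hasMD5", "a" * 32),
}

links = link_by_key(
    graph,
    left_prop="digitalmedia:hasMD5",
    right_prop="virustotal:hasMD5",
    link_prop="integration:MediaFileToVTFile",
)
print(links)
```

The sketch compares key values exactly; a real rule would likely normalize them first (for example, hash letter case).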
  • 14. Evidence Correlation
      • Link instances of dissimilar type across a shared domain
      • Temporal Correlation
          – Rules for establishing time instant & interval relations among recovered artifacts
      • Mereological Correlation
          – “partOf” transitivity relations
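Both correlation types can be sketched in a few lines. The talk expresses them as ontology axioms and rules; the plain functions below are a stand-in under that assumption, and the resource names and timestamps are invented. The interval classifier is deliberately coarse: it collapses several of Allen's interval relations (for instance, "meets" is reported as "overlaps", and "starts"/"finishes" as "during").

```python
from datetime import datetime


def transitive_closure(pairs):
    """Derive every implied (part, whole) pair from direct partOf assertions."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def interval_relation(a_start, a_end, b_start, b_end):
    """Classify interval A against interval B with a coarse set of relations."""
    if a_end < b_start:
        return "before"
    if b_end < a_start:
        return "after"
    if a_start == b_start and a_end == b_end:
        return "equals"
    if b_start <= a_start and a_end <= b_end:
        return "during"
    if a_start <= b_start and b_end <= a_end:
        return "contains"
    return "overlaps"


# Mereological example: a file is part of a directory, partition, and disk.
part_of = {
    ("file:a", "dir:docs"),
    ("dir:docs", "partition:1"),
    ("partition:1", "disk:0"),
}
closure = transitive_closure(part_of)

# Temporal example: a download interval versus a file-creation instant.
download_start = datetime(2013, 1, 1, 10, 0, 0)
download_end = datetime(2013, 1, 1, 10, 0, 5)
file_created = datetime(2013, 1, 1, 10, 0, 7)
relation = interval_relation(download_start, download_end, file_created, file_created)
print(("file:a", "disk:0") in closure, relation)
```

An instant is modeled here as an interval whose start and end coincide, which lets one function cover both instant and interval comparisons.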
  • 15. Semantic Integration & Correlation
  • 16. Integrated Query
      • Purpose-built triplestore (graph) database engine can store the final dataset
          – Up to billions of triples
      • SQL-like queries against the integrated/correlated evidence set
      • Graph pattern matching techniques
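Graph pattern matching can be illustrated with a naive evaluator for a conjunction of triple patterns, where terms starting with `?` are variables. This is only a sketch of the idea; a real triplestore uses indexes and query planning, and the `ex:` resources and path names below are invented. The property names are borrowed from the sample query on slide 21.

```python
def match_pattern(pattern, triple, binding):
    """Try to extend binding so that pattern matches triple; return None on conflict."""
    new_binding = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep the value it was bound to earlier.
            if new_binding.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new_binding


def query(graph, patterns):
    """Evaluate a conjunction of triple patterns, returning all variable bindings."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [
            extended
            for binding in bindings
            for triple in graph
            if (extended := match_pattern(pattern, triple, binding)) is not None
        ]
    return bindings


graph = {
    ("ex:file1", "rdf:type", "digitalmedia:File"),
    ("ex:file1", "digitalmedia:hasPathName", "/Users/a/setup.exe"),
    ("ex:file1", "integration:MediaFileToVTFile", "ex:vt1"),
    ("ex:file2", "rdf:type", "digitalmedia:File"),
    ("ex:file2", "digitalmedia:hasPathName", "/Users/a/notes.txt"),
}

patterns = [
    ("?file", "rdf:type", "digitalmedia:File"),
    ("?file", "digitalmedia:hasPathName", "?pathName"),
    ("?file", "integration:MediaFileToVTFile", "?vtfile"),
]

results = query(graph, patterns)
print(results)
```

Only `ex:file1` satisfies all three patterns, because `ex:file2` has no link to a malware report in this toy graph.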
  • 17. A PoC Instantiation
      • Evidence Manager
      • Filtering / Pre-processing
      • Semantic Parser
      • Inference Engine
      • Classification, Inverse & Transitive Properties
      • Rule & Query Engines
  • 20. Sample Query
      • “Is any file resident on the disk malicious and, if yes, where has it been downloaded from and which ISP did the IP belong to?”
  • 21. Sample Query
      SELECT DISTINCT ?pathName ?uri ?ipvalue ?asnumber ?link
      WHERE {
        ?file rdf:type digitalmedia:File .
        ?file digitalmedia:hasPathName ?pathName .
        ?file digitalmedia:hasMD5 ?md5 .
        ?httpbody integration:HTTPContentToMediaFile ?file .
        ?file integration:MediaFileToVTFile ?vtfile .
        ?vtfile virustotal:hasAVReport ?report .
        ?report virustotal:hasPermanentLink ?link .
        ?httpresp http:body ?httpbody .
        ?httpreq http:requestURI ?uri .
        ?httpreq http:resp ?httpresp .
        ?http packetcapture:hasHTTPRequest ?httpreq .
        ?http rdf:type packetcapture:HTTP .
        ?tcpflow packetcapture:hasApplicationLayerProtocol ?http .
        ?tcpflow packetcapture:hasDestinationIP ?destip .
        ?destip packetcapture:hasIPValue ?ipvalue .
        ?destip integration:PcapIPToWHOISIpAddr ?whoisip .
        ?whoisip whois:isContainedInRange ?range .
        ?range whois:hasRange ?rangeValue .
        ?range whois:isContainedInAS ?as .
        ?as whois:hasNetName ?netname .
        ?as whois:hasASNumber ?asnumber
      }
  • 22. Example Hypotheses / Queries
      • Have there been any unsuccessful connection attempts from systems in the same network as the one that hosted the malicious file?
      • Which disk files have been created or accessed shortly after the malicious file was downloaded?
      • Has there been any successful connection between our system and a known malicious host?
      • Which files have been accessed shortly before the host communicated with any blacklisted network host?
      • Which websites have been visited by the user shortly before the download of the malicious file?
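Several of these hypotheses hinge on "shortly before" or "shortly after" an anchor event. A minimal sketch of such a time-window filter follows; the five-minute window, the `events_shortly_after` helper, and the file names and timestamps are all assumptions made for illustration, since the slides do not define how "shortly" is bounded.

```python
from datetime import datetime, timedelta


def events_shortly_after(anchor_time, events, window=timedelta(minutes=5)):
    """Return names of events with a timestamp in (anchor_time, anchor_time + window]."""
    return sorted(
        name
        for name, timestamp in events.items()
        if anchor_time < timestamp <= anchor_time + window
    )


# Hypothetical download time of the malicious file.
download_time = datetime(2013, 1, 1, 10, 0, 0)

# Hypothetical file-creation timestamps recovered from the disk image.
file_created = {
    "a.tmp": datetime(2013, 1, 1, 10, 1, 0),
    "b.dll": datetime(2013, 1, 1, 10, 4, 0),
    "old.doc": datetime(2013, 1, 1, 9, 0, 0),
    "late.log": datetime(2013, 1, 1, 10, 30, 0),
}

suspicious = events_shortly_after(download_time, file_created)
print(suspicious)
```

In the approach described in the talk, a filter like this would be expressed as a rule or a query constraint over the correlated evidence set rather than as application code.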
  • 23. Summary
      • Ability to represent and integrate heterogeneous data
      • Supports the formulation and execution of complex queries
      • Expandable (ontologies, rules, queries)
      • Computational complexity depends on the ontology, the rules, and the amount of data
      • Reliance on online data sources may affect the accuracy of the results
  • 24. Future Work
      • Advanced reasoning capabilities (e.g. detecting anti-forensic inconsistencies)
      • Extended analysis techniques (e.g. additional data sources, user activities)
      • Large-scale performance evaluation, distributed architecture
      • User-friendly graphical interface for rule/query formulation and result navigation