MattockFS
Computer-Forensics File-System
A family tree
● 2002: OCFA Anycast
● 2006: CarvFS
● 2006: Sealed Digital Evidence Bag
● 2008: MinorFS
The OCFA Anycast
● Message bus solution
● Based on the workers concept
● Part of the Open Computer Forensics Architecture (OCFA)
● Persistent Priority Queues
● Virtually infinite size queues
Issues with Anycast
● Infinite size queues
● Multiple tools in tool-chain access same data
● Result: Many page-cache misses
Spurious reads in OCFA
● Page-cache misses
● Early hashing and Content Addressed Storage
● No design for locality of data
● Server-side data entry
CarvFS
● Storage requirements of traditional file carving
● CarvFS allows for zero-storage carving
● Carved files are not copied out but designated
● CarvPath designations
– /mnt/carvfs/mp3/18400+4096_S4096_47912+975.crv
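The designation above can be read as a list of fragments of the underlying image. A minimal sketch, assuming the CarvPath token grammar is `<offset>+<size>` for a data fragment and `S<size>` for a sparse (zero-filled) fragment, with tokens joined by `_`. The function names and the tuple layout are illustrative only and are not part of CarvFS.

```python
import os


def parse_carvpath(path):
    """Split a CarvPath designation into (kind, offset, size) fragments.

    Assumed grammar (not taken from the CarvFS sources):
      "<offset>+<size>"  -> data fragment inside the image
      "S<size>"          -> sparse fragment, reads back as zeros
    """
    name = os.path.basename(path)
    stem, _ext = os.path.splitext(name)  # drop a trailing extension such as ".crv"
    fragments = []
    for token in stem.split("_"):
        if token.startswith("S"):
            fragments.append(("sparse", None, int(token[1:])))
        else:
            offset, size = token.split("+")
            fragments.append(("data", int(offset), int(size)))
    return fragments


def total_size(fragments):
    """Size in bytes of the file that the fragments designate."""
    return sum(size for _kind, _offset, size in fragments)


frags = parse_carvpath("/mnt/carvfs/mp3/18400+4096_S4096_47912+975.crv")
print(frags)
print(total_size(frags))  # 4096 + 4096 + 975 = 9167
```

Under these assumptions the example designates a 9167-byte file made of two regions of the image with a 4096-byte gap of zeros between them, without copying any data.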
Issues with CarvFS
● Read-only access to forensic disk image
● In large cases, hundreds of mounted image files
● OCFA hacks
– Bypass CarvFS to write to the underlying growing archive
– Inefficient hybrid CarvFS/CAS storage
Shared mutable data
● Meta-data
● Evidence data
● Derived evidence data
● Provenance logs
● Anti-forensics?
● Forensic process integrity?
Integrity measures
● Trusted immutable data and meta-data
– Lab-side sealed digital evidence bags
● Trusted provenance logging
● Capability-based access control
– Sparse capabilities
– Mandatory Access Controls
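One way to picture a sparse capability: an unguessable token drawn from a very large name space, where holding the token is itself the authority to use the object it names. A minimal sketch of that idea; `CapabilityTable`, `grant`, `use`, and `revoke` are hypothetical names for illustration and do not describe the MattockFS API.

```python
import secrets


class CapabilityTable:
    """Maps unguessable tokens to objects; no token, no access."""

    def __init__(self):
        self._objects = {}

    def grant(self, obj):
        # 32 random bytes (256 bits): the name space is so sparsely
        # populated that guessing a valid token is not practical.
        token = secrets.token_hex(32)
        self._objects[token] = obj
        return token

    def use(self, token):
        try:
            return self._objects[token]
        except KeyError:
            raise PermissionError("unknown capability") from None

    def revoke(self, token):
        self._objects.pop(token, None)


table = CapabilityTable()
cap = table.grant("evidence-chunk")
print(table.use(cap))
```

The design choice illustrated here is that there is no separate identity check: a worker can only reach data for which it was explicitly handed a token.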
Spurious reads measures
● Opportunistic hashing
● Linux fadvise API
● Integration of message-bus and data-archive
● Priority queues → Sortable sets
● Provide pressure data for throttling purposes
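Opportunistic hashing can be sketched as follows: the hash only advances when a read or write happens to cover the offset where hashing left off, so the digest is built from data that is already in memory instead of by a dedicated extra pass over the file. This is an illustrative sketch; the class and method names are assumptions, and BLAKE2b is only a stand-in for whatever hash a framework uses.

```python
import hashlib


class OpportunisticHash:
    """Hashes a data entity piggybacking on I/O that happens anyway."""

    def __init__(self, size):
        self._hash = hashlib.blake2b(digest_size=32)
        self._offset = 0      # everything before this offset is hashed
        self._size = size     # total size of the entity being hashed

    def observe(self, offset, data):
        """Feed data seen by a read or write at `offset`."""
        end = offset + len(data)
        # Only usable if this chunk covers the current hashing offset.
        if offset <= self._offset < end:
            self._hash.update(data[self._offset - offset:])
            self._offset = end

    @property
    def done(self):
        return self._offset >= self._size

    def hexdigest(self):
        """Digest once all data has been seen, else None."""
        return self._hash.hexdigest() if self.done else None


blob = b"abcdefghij"
h = OpportunisticHash(len(blob))
h.observe(5, blob[5:])    # not contiguous yet: ignored
h.observe(0, blob[:6])    # hashes bytes 0..5
h.observe(4, blob[4:])    # overlaps: hashes bytes 6..9 only
print(h.done, h.hexdigest() == hashlib.blake2b(blob, digest_size=32).hexdigest())
```

Once an entity is fully hashed and no tool is queued to read it, a framework could additionally hint the kernel through the fadvise API, for example with Python's `os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)`, that its pages are no longer needed; whether and when to do so is a policy decision outside this sketch.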
MattockFS
● User space file-system for forensic frameworks
● Zero-storage carving
● Hybrid data-archive/message-bus solution
● Helps minimize performance degradation induced by spurious reads
● Provides write-once (meta-)data
● Provides a safe capability-based API
● Provides trusted provenance logs
MattockFS: Hooks
● Hooks for forensic frameworks
– Framework-based routing functionality
– Pressure data for throttling new-data input
– Queue size for throttling
● Hooks for multi-server setup
– Geared to locality of data
– Migrating CPU-intensive jobs
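How a framework might use pressure data for throttling is sketched below: new evidence is only admitted while the amount of queued, not yet processed data stays under a threshold, so that data is still likely to be in the page cache when the next tool picks it up. The names (`Throttle`, `max_pending_bytes`) and the byte-count pressure measure are assumptions for illustration, not the MattockFS interface.

```python
class Throttle:
    """Admits new input only while outstanding work stays bounded."""

    def __init__(self, max_pending_bytes):
        self.max_pending_bytes = max_pending_bytes
        self.pending_bytes = 0   # hypothetical pressure figure

    def may_admit(self, size):
        return self.pending_bytes + size <= self.max_pending_bytes

    def admit(self, size):
        """Account for new input; returns False if it must wait."""
        if not self.may_admit(size):
            return False
        self.pending_bytes += size
        return True

    def completed(self, size):
        """Called when the tool-chain has finished with `size` bytes."""
        self.pending_bytes = max(0, self.pending_bytes - size)


throttle = Throttle(max_pending_bytes=1024)
print(throttle.admit(800))   # True
print(throttle.admit(400))   # False: would exceed the limit
throttle.completed(800)
print(throttle.admit(400))   # True
```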
MattockFS
Combining a zero-storage carving-capable, high-integrity
data-storage archive with trusted forensic logging,
a capability-secure API and a local page-cache-friendly
message bus solution.

More Related Content

What's hot

HPCS16 - Frederick Lefebvre - Bridging the last mile
HPCS16 - Frederick Lefebvre - Bridging the last mileHPCS16 - Frederick Lefebvre - Bridging the last mile
HPCS16 - Frederick Lefebvre - Bridging the last mileFrédérick Lefebvre
 
YDAL Barcelona
YDAL BarcelonaYDAL Barcelona
YDAL Barcelona
Gluster.org
 
Enabling Presto Caching at Uber with Alluxio
Enabling Presto Caching at Uber with AlluxioEnabling Presto Caching at Uber with Alluxio
Enabling Presto Caching at Uber with Alluxio
Alluxio, Inc.
 
Gluster for sysadmins
Gluster for sysadminsGluster for sysadmins
Gluster for sysadmins
Gluster.org
 
Tiering barcelona
Tiering barcelonaTiering barcelona
Tiering barcelona
Gluster.org
 
Distributed Timeseries Database In Go (gophercon India 17)
Distributed Timeseries Database In Go (gophercon India 17)Distributed Timeseries Database In Go (gophercon India 17)
Distributed Timeseries Database In Go (gophercon India 17)
Matthew Campbell
 
Data engineering Stl Big Data IDEA user group
Data engineering   Stl Big Data IDEA user groupData engineering   Stl Big Data IDEA user group
Data engineering Stl Big Data IDEA user group
Adam Doyle
 
Towards Data Operations
Towards Data OperationsTowards Data Operations
Towards Data Operations
Andrea Monacchi
 
2011 IBM-KNAW Cambridge - How to store meaningful bits permanently
2011 IBM-KNAW Cambridge - How to store meaningful bits permanently2011 IBM-KNAW Cambridge - How to store meaningful bits permanently
2011 IBM-KNAW Cambridge - How to store meaningful bits permanently
Dirk Roorda
 
Introducing gluster filesystem by aditya
Introducing gluster filesystem by adityaIntroducing gluster filesystem by aditya
Introducing gluster filesystem by aditya
Aditya Chhikara
 
CEPH DAY BERLIN - WELCOME
CEPH DAY BERLIN - WELCOME CEPH DAY BERLIN - WELCOME
CEPH DAY BERLIN - WELCOME
Ceph Community
 
Lisa 2015-gluster fs-introduction
Lisa 2015-gluster fs-introductionLisa 2015-gluster fs-introduction
Lisa 2015-gluster fs-introduction
Gluster.org
 
GlusterFs Architecture & Roadmap - LinuxCon EU 2013
GlusterFs Architecture & Roadmap - LinuxCon EU 2013GlusterFs Architecture & Roadmap - LinuxCon EU 2013
GlusterFs Architecture & Roadmap - LinuxCon EU 2013
Gluster.org
 
EKAW - Publishing with Triple Pattern Fragments
EKAW - Publishing with Triple Pattern FragmentsEKAW - Publishing with Triple Pattern Fragments
EKAW - Publishing with Triple Pattern Fragments
Ruben Taelman
 
Glusterfs and Hadoop
Glusterfs and HadoopGlusterfs and Hadoop
Glusterfs and Hadoop
Shubhendu Tripathi
 
Elasticsearch avoiding hotspots
Elasticsearch  avoiding hotspotsElasticsearch  avoiding hotspots
Elasticsearch avoiding hotspots
Christophe Marchal
 
Improve Presto Architectural Decisions with Shadow Cache
 Improve Presto Architectural Decisions with Shadow Cache Improve Presto Architectural Decisions with Shadow Cache
Improve Presto Architectural Decisions with Shadow Cache
Alluxio, Inc.
 
MongoDB SF Ruby
MongoDB SF RubyMongoDB SF Ruby
MongoDB SF Ruby
Mike Dirolf
 
Handling the growth of data
Handling the growth of dataHandling the growth of data
Handling the growth of data
Piyush Katariya
 
GlusterFS Talk for CentOS Dojo Bangalore
GlusterFS Talk for CentOS Dojo BangaloreGlusterFS Talk for CentOS Dojo Bangalore
GlusterFS Talk for CentOS Dojo Bangalore
Raghavendra Talur
 

What's hot (20)

HPCS16 - Frederick Lefebvre - Bridging the last mile
HPCS16 - Frederick Lefebvre - Bridging the last mileHPCS16 - Frederick Lefebvre - Bridging the last mile
HPCS16 - Frederick Lefebvre - Bridging the last mile
 
YDAL Barcelona
YDAL BarcelonaYDAL Barcelona
YDAL Barcelona
 
Enabling Presto Caching at Uber with Alluxio
Enabling Presto Caching at Uber with AlluxioEnabling Presto Caching at Uber with Alluxio
Enabling Presto Caching at Uber with Alluxio
 
Gluster for sysadmins
Gluster for sysadminsGluster for sysadmins
Gluster for sysadmins
 
Tiering barcelona
Tiering barcelonaTiering barcelona
Tiering barcelona
 
Distributed Timeseries Database In Go (gophercon India 17)
Distributed Timeseries Database In Go (gophercon India 17)Distributed Timeseries Database In Go (gophercon India 17)
Distributed Timeseries Database In Go (gophercon India 17)
 
Data engineering Stl Big Data IDEA user group
Data engineering   Stl Big Data IDEA user groupData engineering   Stl Big Data IDEA user group
Data engineering Stl Big Data IDEA user group
 
Towards Data Operations
Towards Data OperationsTowards Data Operations
Towards Data Operations
 
2011 IBM-KNAW Cambridge - How to store meaningful bits permanently
2011 IBM-KNAW Cambridge - How to store meaningful bits permanently2011 IBM-KNAW Cambridge - How to store meaningful bits permanently
2011 IBM-KNAW Cambridge - How to store meaningful bits permanently
 
Introducing gluster filesystem by aditya
Introducing gluster filesystem by adityaIntroducing gluster filesystem by aditya
Introducing gluster filesystem by aditya
 
CEPH DAY BERLIN - WELCOME
CEPH DAY BERLIN - WELCOME CEPH DAY BERLIN - WELCOME
CEPH DAY BERLIN - WELCOME
 
Lisa 2015-gluster fs-introduction
Lisa 2015-gluster fs-introductionLisa 2015-gluster fs-introduction
Lisa 2015-gluster fs-introduction
 
GlusterFs Architecture & Roadmap - LinuxCon EU 2013
GlusterFs Architecture & Roadmap - LinuxCon EU 2013GlusterFs Architecture & Roadmap - LinuxCon EU 2013
GlusterFs Architecture & Roadmap - LinuxCon EU 2013
 
EKAW - Publishing with Triple Pattern Fragments
EKAW - Publishing with Triple Pattern FragmentsEKAW - Publishing with Triple Pattern Fragments
EKAW - Publishing with Triple Pattern Fragments
 
Glusterfs and Hadoop
Glusterfs and HadoopGlusterfs and Hadoop
Glusterfs and Hadoop
 
Elasticsearch avoiding hotspots
Elasticsearch  avoiding hotspotsElasticsearch  avoiding hotspots
Elasticsearch avoiding hotspots
 
Improve Presto Architectural Decisions with Shadow Cache
 Improve Presto Architectural Decisions with Shadow Cache Improve Presto Architectural Decisions with Shadow Cache
Improve Presto Architectural Decisions with Shadow Cache
 
MongoDB SF Ruby
MongoDB SF RubyMongoDB SF Ruby
MongoDB SF Ruby
 
Handling the growth of data
Handling the growth of dataHandling the growth of data
Handling the growth of data
 
GlusterFS Talk for CentOS Dojo Bangalore
GlusterFS Talk for CentOS Dojo BangaloreGlusterFS Talk for CentOS Dojo Bangalore
GlusterFS Talk for CentOS Dojo Bangalore
 

Viewers also liked

Už se vám někdy stalo?
Už se vám někdy stalo?Už se vám někdy stalo?
Už se vám někdy stalo?Zdeněk Klusák
 
The design of forensic computer workstations
The design of forensic computer workstationsThe design of forensic computer workstations
The design of forensic computer workstations
jkvr100
 
Computer forensic ppt
Computer forensic pptComputer forensic ppt
Computer forensic ppt
Onkar1431
 
Electornic evidence collection
Electornic evidence collectionElectornic evidence collection
Electornic evidence collectionFakrul Alam
 
Capturing forensics image
Capturing forensics imageCapturing forensics image
Capturing forensics image
Chris Harrington
 
Forensic Science - 01 What is forensic science?
Forensic Science - 01 What is forensic science?Forensic Science - 01 What is forensic science?
Forensic Science - 01 What is forensic science?
Ian Anderson
 
Elements Of Forensic Science
Elements Of Forensic ScienceElements Of Forensic Science
Elements Of Forensic Scienceannperry09
 
Computer forensic
Computer forensicComputer forensic
Computer forensicbhavithd
 
BoyarMiller - You Lost Me At Gigabyte: Working with Computer Forensic Examiners
BoyarMiller - You Lost Me At Gigabyte: Working with Computer Forensic ExaminersBoyarMiller - You Lost Me At Gigabyte: Working with Computer Forensic Examiners
BoyarMiller - You Lost Me At Gigabyte: Working with Computer Forensic Examiners
BoyarMiller
 
Computer Forensics in Fighting Crimes
Computer Forensics in Fighting CrimesComputer Forensics in Fighting Crimes
Computer Forensics in Fighting CrimesIsaiah Edem
 
Business Intelligence (BI) Tools For Computer Forensic
Business Intelligence (BI) Tools For Computer ForensicBusiness Intelligence (BI) Tools For Computer Forensic
Business Intelligence (BI) Tools For Computer Forensic
Dhiren Gala
 
Document clustering for forensic analysis an approach for improving compute...
Document clustering for forensic   analysis an approach for improving compute...Document clustering for forensic   analysis an approach for improving compute...
Document clustering for forensic analysis an approach for improving compute...
Madan Golla
 
7 Forensic Science Powerpoint Chapter 07 Forensic Anthropology
7  Forensic Science Powerpoint Chapter 07 Forensic Anthropology7  Forensic Science Powerpoint Chapter 07 Forensic Anthropology
7 Forensic Science Powerpoint Chapter 07 Forensic AnthropologyGrossmont College
 
Introduction to computer forensic
Introduction to computer forensicIntroduction to computer forensic
Introduction to computer forensic
Online
 
Computer forensic 101 - OWASP Khartoum
Computer forensic 101 - OWASP KhartoumComputer forensic 101 - OWASP Khartoum
Computer forensic 101 - OWASP Khartoum
OWASP Khartoum
 
Introduction to forensic science
Introduction to forensic scienceIntroduction to forensic science
Introduction to forensic science
Abby Varghese
 
Digital Evidence in Computer Forensic Investigations
Digital Evidence in Computer Forensic InvestigationsDigital Evidence in Computer Forensic Investigations
Digital Evidence in Computer Forensic Investigations
Filip Maertens
 
Principles of forensic science
Principles of forensic sciencePrinciples of forensic science
Principles of forensic science
Bhoopendra Singh
 

Viewers also liked (20)

Už se vám někdy stalo?
Už se vám někdy stalo?Už se vám někdy stalo?
Už se vám někdy stalo?
 
The design of forensic computer workstations
The design of forensic computer workstationsThe design of forensic computer workstations
The design of forensic computer workstations
 
Computer forensic ppt
Computer forensic pptComputer forensic ppt
Computer forensic ppt
 
Electornic evidence collection
Electornic evidence collectionElectornic evidence collection
Electornic evidence collection
 
Capturing forensics image
Capturing forensics imageCapturing forensics image
Capturing forensics image
 
File000173
File000173File000173
File000173
 
Forensic Science - 01 What is forensic science?
Forensic Science - 01 What is forensic science?Forensic Science - 01 What is forensic science?
Forensic Science - 01 What is forensic science?
 
Elements Of Forensic Science
Elements Of Forensic ScienceElements Of Forensic Science
Elements Of Forensic Science
 
Computer forensic
Computer forensicComputer forensic
Computer forensic
 
BoyarMiller - You Lost Me At Gigabyte: Working with Computer Forensic Examiners
BoyarMiller - You Lost Me At Gigabyte: Working with Computer Forensic ExaminersBoyarMiller - You Lost Me At Gigabyte: Working with Computer Forensic Examiners
BoyarMiller - You Lost Me At Gigabyte: Working with Computer Forensic Examiners
 
Computer forensic
Computer forensicComputer forensic
Computer forensic
 
Computer Forensics in Fighting Crimes
Computer Forensics in Fighting CrimesComputer Forensics in Fighting Crimes
Computer Forensics in Fighting Crimes
 
Business Intelligence (BI) Tools For Computer Forensic
Business Intelligence (BI) Tools For Computer ForensicBusiness Intelligence (BI) Tools For Computer Forensic
Business Intelligence (BI) Tools For Computer Forensic
 
Document clustering for forensic analysis an approach for improving compute...
Document clustering for forensic   analysis an approach for improving compute...Document clustering for forensic   analysis an approach for improving compute...
Document clustering for forensic analysis an approach for improving compute...
 
7 Forensic Science Powerpoint Chapter 07 Forensic Anthropology
7  Forensic Science Powerpoint Chapter 07 Forensic Anthropology7  Forensic Science Powerpoint Chapter 07 Forensic Anthropology
7 Forensic Science Powerpoint Chapter 07 Forensic Anthropology
 
Introduction to computer forensic
Introduction to computer forensicIntroduction to computer forensic
Introduction to computer forensic
 
Computer forensic 101 - OWASP Khartoum
Computer forensic 101 - OWASP KhartoumComputer forensic 101 - OWASP Khartoum
Computer forensic 101 - OWASP Khartoum
 
Introduction to forensic science
Introduction to forensic scienceIntroduction to forensic science
Introduction to forensic science
 
Digital Evidence in Computer Forensic Investigations
Digital Evidence in Computer Forensic InvestigationsDigital Evidence in Computer Forensic Investigations
Digital Evidence in Computer Forensic Investigations
 
Principles of forensic science
Principles of forensic sciencePrinciples of forensic science
Principles of forensic science
 

Similar to MattockFS Computer Forensic File-System

Log ingestion kafka -- impala using apex
Log ingestion   kafka -- impala using apexLog ingestion   kafka -- impala using apex
Log ingestion kafka -- impala using apex
Apache Apex
 
Ursa Labs and Apache Arrow in 2019
Ursa Labs and Apache Arrow in 2019Ursa Labs and Apache Arrow in 2019
Ursa Labs and Apache Arrow in 2019
Wes McKinney
 
Big Data Meets HPC - Exploiting HPC Technologies for Accelerating Big Data Pr...
Big Data Meets HPC - Exploiting HPC Technologies for Accelerating Big Data Pr...Big Data Meets HPC - Exploiting HPC Technologies for Accelerating Big Data Pr...
Big Data Meets HPC - Exploiting HPC Technologies for Accelerating Big Data Pr...
inside-BigData.com
 
Architecture of a Next-Generation Parallel File System
Architecture of a Next-Generation Parallel File System	Architecture of a Next-Generation Parallel File System
Architecture of a Next-Generation Parallel File System
Great Wide Open
 
Merging heterogeneous network measurement data
Merging heterogeneous network measurement dataMerging heterogeneous network measurement data
Merging heterogeneous network measurement data
Jorge E. López de Vergara Méndez
 
Apache Arrow -- Cross-language development platform for in-memory data
Apache Arrow -- Cross-language development platform for in-memory dataApache Arrow -- Cross-language development platform for in-memory data
Apache Arrow -- Cross-language development platform for in-memory data
Wes McKinney
 
Supercharging Data Performance for Real-Time Data Analysis
Supercharging Data Performance for Real-Time Data Analysis Supercharging Data Performance for Real-Time Data Analysis
Supercharging Data Performance for Real-Time Data Analysis
Ryft
 
Big Data Streams Architectures. Why? What? How?
Big Data Streams Architectures. Why? What? How?Big Data Streams Architectures. Why? What? How?
Big Data Streams Architectures. Why? What? How?
Anton Nazaruk
 
Big Data Architecture Workshop - Vahid Amiri
Big Data Architecture Workshop -  Vahid AmiriBig Data Architecture Workshop -  Vahid Amiri
Big Data Architecture Workshop - Vahid Amiri
datastack
 
Hadoop Distributed File System
Hadoop Distributed File SystemHadoop Distributed File System
Hadoop Distributed File Systemelliando dias
 
Hadoop
HadoopHadoop
Datasheet hus-and-hitachi-nas-platform-4000-series
Datasheet hus-and-hitachi-nas-platform-4000-seriesDatasheet hus-and-hitachi-nas-platform-4000-series
Datasheet hus-and-hitachi-nas-platform-4000-seriesJeff Yana
 
Drill at the Chicago Hug
Drill at the Chicago HugDrill at the Chicago Hug
Drill at the Chicago Hug
MapR Technologies
 
SQL and Machine Learning on Hadoop
SQL and Machine Learning on HadoopSQL and Machine Learning on Hadoop
SQL and Machine Learning on Hadoop
Mukund Babbar
 
Hadoop ppt1
Hadoop ppt1Hadoop ppt1
Hadoop ppt1
chariorienit
 
Storage solutions for High Performance Computing
Storage solutions for High Performance ComputingStorage solutions for High Performance Computing
Storage solutions for High Performance Computing
gmateesc
 
Skyhook: Towards an Arrow-Native Storage System, CCGrid 2022
Skyhook: Towards an Arrow-Native Storage System, CCGrid 2022Skyhook: Towards an Arrow-Native Storage System, CCGrid 2022
Skyhook: Towards an Arrow-Native Storage System, CCGrid 2022
JayjeetChakraborty
 
Hadoop storage
Hadoop storageHadoop storage
Hadoop storage
SanSan149
 
Apache Spark 101 - Demi Ben-Ari
Apache Spark 101 - Demi Ben-AriApache Spark 101 - Demi Ben-Ari
Apache Spark 101 - Demi Ben-Ari
Demi Ben-Ari
 
Interactive Data Analysis in Spark Streaming
Interactive Data Analysis in Spark StreamingInteractive Data Analysis in Spark Streaming
Interactive Data Analysis in Spark Streaming
datamantra
 

Similar to MattockFS Computer Forensic File-System (20)

Log ingestion kafka -- impala using apex
Log ingestion   kafka -- impala using apexLog ingestion   kafka -- impala using apex
Log ingestion kafka -- impala using apex
 
Ursa Labs and Apache Arrow in 2019
Ursa Labs and Apache Arrow in 2019Ursa Labs and Apache Arrow in 2019
Ursa Labs and Apache Arrow in 2019
 
Big Data Meets HPC - Exploiting HPC Technologies for Accelerating Big Data Pr...
Big Data Meets HPC - Exploiting HPC Technologies for Accelerating Big Data Pr...Big Data Meets HPC - Exploiting HPC Technologies for Accelerating Big Data Pr...
Big Data Meets HPC - Exploiting HPC Technologies for Accelerating Big Data Pr...
 
Architecture of a Next-Generation Parallel File System
Architecture of a Next-Generation Parallel File System	Architecture of a Next-Generation Parallel File System
Architecture of a Next-Generation Parallel File System
 
Merging heterogeneous network measurement data
Merging heterogeneous network measurement dataMerging heterogeneous network measurement data
Merging heterogeneous network measurement data
 
Apache Arrow -- Cross-language development platform for in-memory data
Apache Arrow -- Cross-language development platform for in-memory dataApache Arrow -- Cross-language development platform for in-memory data
Apache Arrow -- Cross-language development platform for in-memory data
 
Supercharging Data Performance for Real-Time Data Analysis
Supercharging Data Performance for Real-Time Data Analysis Supercharging Data Performance for Real-Time Data Analysis
Supercharging Data Performance for Real-Time Data Analysis
 
Big Data Streams Architectures. Why? What? How?
Big Data Streams Architectures. Why? What? How?Big Data Streams Architectures. Why? What? How?
Big Data Streams Architectures. Why? What? How?
 
Big Data Architecture Workshop - Vahid Amiri
Big Data Architecture Workshop -  Vahid AmiriBig Data Architecture Workshop -  Vahid Amiri
Big Data Architecture Workshop - Vahid Amiri
 
Hadoop Distributed File System
Hadoop Distributed File SystemHadoop Distributed File System
Hadoop Distributed File System
 
Hadoop
HadoopHadoop
Hadoop
 
Datasheet hus-and-hitachi-nas-platform-4000-series
Datasheet hus-and-hitachi-nas-platform-4000-seriesDatasheet hus-and-hitachi-nas-platform-4000-series
Datasheet hus-and-hitachi-nas-platform-4000-series
 
Drill at the Chicago Hug
Drill at the Chicago HugDrill at the Chicago Hug
Drill at the Chicago Hug
 
SQL and Machine Learning on Hadoop
SQL and Machine Learning on HadoopSQL and Machine Learning on Hadoop
SQL and Machine Learning on Hadoop
 
Hadoop ppt1
Hadoop ppt1Hadoop ppt1
Hadoop ppt1
 
Storage solutions for High Performance Computing
Storage solutions for High Performance ComputingStorage solutions for High Performance Computing
Storage solutions for High Performance Computing
 
Skyhook: Towards an Arrow-Native Storage System, CCGrid 2022
Skyhook: Towards an Arrow-Native Storage System, CCGrid 2022Skyhook: Towards an Arrow-Native Storage System, CCGrid 2022
Skyhook: Towards an Arrow-Native Storage System, CCGrid 2022
 
Hadoop storage
Hadoop storageHadoop storage
Hadoop storage
 
Apache Spark 101 - Demi Ben-Ari
Apache Spark 101 - Demi Ben-AriApache Spark 101 - Demi Ben-Ari
Apache Spark 101 - Demi Ben-Ari
 
Interactive Data Analysis in Spark Streaming
Interactive Data Analysis in Spark StreamingInteractive Data Analysis in Spark Streaming
Interactive Data Analysis in Spark Streaming
 

Recently uploaded

Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
KatiaHIMEUR1
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
UiPathCommunity
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
Product School
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
Ana-Maria Mihalceanu
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
 
Generating a custom Ruby SDK for your web service or Rails API using Smithy
Generating a custom Ruby SDK for your web service or Rails API using SmithyGenerating a custom Ruby SDK for your web service or Rails API using Smithy
Generating a custom Ruby SDK for your web service or Rails API using Smithy
g2nightmarescribd
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
Sri Ambati
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
Elena Simperl
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
DianaGray10
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
Thijs Feryn
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
ThousandEyes
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
Frank van Harmelen
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Jeffrey Haguewood
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Product School
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
Alan Dix
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
Laura Byrne
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
Paul Groth
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Thierry Lestable
 

Recently uploaded (20)

Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
Generating a custom Ruby SDK for your web service or Rails API using Smithy
Generating a custom Ruby SDK for your web service or Rails API using SmithyGenerating a custom Ruby SDK for your web service or Rails API using Smithy
Generating a custom Ruby SDK for your web service or Rails API using Smithy
 

MattockFS Computer Forensic File-System

  • 2. A family tree ● 2002: OCFA Anycast ● 2006: CarvFS ● 2006: Sealed Digital Evidence Bag ● 2008: MinorFS
  • 3. The OCFA Anycast ● Message bus solution ● Based on the workers concept ● Part of Open Computer Forensics Architecture ● Persistent Priority Queues ● Virtually infinite size queues
  • 4. Issues with AnyCast ● Infinite size queues ● Multiple tools in tool-chain access same data ● Result: Many page-cache misses
  • 5. Spurious reads in OCFA ● Page-cache misses ● Early hashing and Content Addressed Storage ● No locality of data design ● Server-side data entry
  • 6. CarvFS ● Storage requirements traditional file carving ● CarvFS allows for zero-storage carving ● Carved files not copied out but designated ● CarvPath designations – /mnt/carvfs/mp3/18400+4096_S4096_47912+975.crv
  • 7. Issues with CarvFS ● Read-only access to forensic disk image ● In large cases hundreds of mounted image files ● OCFA hacks – Bypass CarvFS to write to underlying growing archive – Inefficient hybrid CarvFS/CAS storage
  • 8. Shared mutable data ● Meta-data ● Evidence data ● Derived evidence data ● Provenance logs
  • 9. Shared mutable data ● Meta-data ● Evidence data ● Derived evidence data ● Provenance logs ● Anti forensics ? ● Forensic process integrity ?
  • 10. Integrity measures ● Trusted immutable data and meta-data – Lab side sealed digital evidence bags. ● Trusted provenance logging ● Capability based access control – Sparse capabilities – Mandatory Access Controls
  • 11. Spurious reads measures ● Opportunistic hashing ● Linux fadvise API ● Integration of message-bus and data-archive ● Priority queues → Sortable sets ● Provide pressure data for throttling purposes
  • 12. MattockFS ● User space file-system for forensic frameworks ● Zero-storage carving ● Hybrid data-archive/message-bus solution ● Helps minimize spurious read induced performance degradation. ● Provides write-once (meta) data ● Provides safe capability based API ● Provides trusted provenance logs
  • 13. MattockFS: Hooks ● Hooks for forensic frameworks – Framework based routing functionality – Pressure data for throttling new-data input – Queue size for throttling ● Hooks for multi-server setup – Geared to locality of data – Migrating CPU intensive jobs
  • 14. MattockFS Combining a zero-storage carving-capable high-integrity data-storage archive with trusted forensic logging, a capability secure API and a local page-cache friendly message bus solution.
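
The CarvPath designation on slide 6 can be unpacked with a few lines of Python. This is a minimal sketch, not the CarvFS or MattockFS parser: the `Fragment` type and the function names are hypothetical, and it assumes that fragments are joined with `_`, that `offset+size` names a region of the mounted image, and that an `S<size>` token denotes a sparse fragment of that size.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import List, Optional


@dataclass(frozen=True)
class Fragment:
    """One piece of a carved file (hypothetical type for illustration)."""
    offset: Optional[int]  # None for a sparse fragment (assumed meaning of "S<size>")
    size: int


def designation_from_path(path: str) -> str:
    """Strip the directory and the file extension from a CarvPath-style path."""
    return PurePosixPath(path).stem


def parse_designation(designation: str) -> List[Fragment]:
    """Split a designation into its fragments, in file order."""
    fragments = []
    for token in designation.split("_"):
        if token.startswith("S"):
            # Assumption: "S4096" is a sparse fragment of 4096 bytes.
            fragments.append(Fragment(None, int(token[1:])))
        else:
            offset, size = token.split("+")
            fragments.append(Fragment(int(offset), int(size)))
    return fragments


def total_size(fragments: List[Fragment]) -> int:
    """Size of the designated file: the sum of all fragment sizes."""
    return sum(fragment.size for fragment in fragments)


fragments = parse_designation(
    designation_from_path("/mnt/carvfs/mp3/18400+4096_S4096_47912+975.crv")
)
```

Under these assumptions the example path designates three fragments: two regions of the image with a sparse fragment between them. No data is copied; the path alone says where the bytes live.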

Editor's Notes

  1. This is a short talk about the MattockFS computer-forensic file-system. MattockFS is a user-space file-system that runs on Linux and is geared towards the specific needs of computer-forensic frameworks, both for data storage and access and for message-bus facilities.
  2. Before we get into MattockFS itself: you could say MattockFS has four important ancestors, each contributing part of its feature set. The Open Computer Forensics Architecture had a core component named the Anycast. This was basically what today would be called a message broker or a message bus. CarvFS introduced the concept of CarvPath designations as a way to use pseudo file-paths to implement a feature known as in-line carving or zero-storage carving, where you can extract files from a disk image without actually copying them out. Then the Sealed Digital Evidence Bag concept and MinorFS introduced system integrity properties and facilities that, as I will show, are important with respect to the threat of anti-forensics.
  3. Let's start by looking at one of these MattockFS ancestors: the OCFA Anycast. The Anycast was a message bus solution, somewhat like RabbitMQ, but geared solely towards the worker concept. It was part of a computer forensic framework named the Open Computer Forensics Architecture, or OCFA. It used persistent priority queues that were, for all intents and purposes, infinite in size, a property that we later found out had some drawbacks.
  4. So what issues were there with the Anycast? The problem lies in the very nature of the computer forensic process. In this process, multiple tools in a tool-chain are used consecutively on a single piece of data. However, the infinite-size queues, combined with the lack of throttling in the OCFA framework, meant that much of the data had long been evicted from the operating system's page-cache by the time the next tool accessed it. This resulted in many page-cache misses and thus many spurious reads from the actual hard disks.
  5. Given the IO-intensive nature of the computer forensic process, spurious reads can be quite a performance issue for the overall system. Page-cache misses as induced by the Anycast design are one source of such spurious reads, but OCFA had other design issues that led to even more spurious reads, either by aggravating the page-cache miss problem or for other reasons. OCFA used early hashing of all data, partially as a result of its primary storage model, which was based on Content Addressed Storage or CAS. This added extra time between the first read and the last read, and thus made the page-cache problem worse. Two other spurious-read issues resulted from the lack of a “locality of data” based design and from the server-side data entry.
  6. A second ancestor of MattockFS is CarvFS. CarvFS is a user-space file-system that mounts a single computer forensic disk image and then lets you carve files from unallocated space without actually having to copy the data out. CarvFS introduced the concept of CarvPath designations, like the one you see at the bottom of this slide. These designations are made up primarily of combinations of offset and size that together describe where in the mounted entity the actual data resides.
  7. While CarvFS was a big step forward for carving out files, there were a number of issues when it was used within a computer forensic framework. CarvFS was designed to mount a single disk image, but in a large case hundreds of disk images will be entered. In the case of OCFA this meant that hundreds of disk image files would have to be mounted as user-space file-systems simultaneously, and the resulting use of system resources turned out to be prohibitive. Another issue was that CAS storage was still needed for adding data in any other way. To use CarvFS with OCFA, a hack was used that basically had modules bypass CarvFS altogether in order to create one single big archive file. The integration of CarvFS into OCFA never became fully optimal because CAS storage remained such an integral part of OCFA's core design.
  8. Modern insights with respect to system integrity force us to look at the use of shared mutable state within computer forensic frameworks. Basically a forensic framework weaves together a number of tools into suitable tool-chains for different types of data. Each module in such a framework that accesses shared mutable data has the theoretical ability to change that data. As none of the modules is expected to be malicious, the historic approach to shared mutable data, also used in OCFA, was to assume that this would not be a serious issue.
  9. Enter anti-forensics. Tools used in the tool-chain, just like programs such as web browsers and PDF readers, have vulnerabilities. Remember that the suspect controls what data resided on his disk. There could be exploits targeting specific tools or libraries in the tool-chain, and these exploits could be used to target shared mutable data in our framework, either corrupting the data or possibly even rendering the whole system useless. With respect to system integrity, experience with OCFA has shown that some modules can just crash benignly if exposed to certain wrongly identified data files. Given that a random crash can have undefined behavior, there remains a serious risk that even crashes unrelated to anti-forensics might corrupt the system through shared mutable state.
  10. OK, let's start looking at MattockFS now. First let's look at the system integrity measures that MattockFS introduces. The concept of Sealed Digital Evidence Bags introduced us to the idea that data and meta-data should be sealed. MattockFS uses the fact that it runs under a different user id than the rest of the system to provide trusted write-once immutable data. It also uses this fact to provide a trusted provenance log facility. Data, meta-data and provenance log are protected from tampering by an exploit against one of the tools in the tool-chain. Any access to MattockFS that could interfere with these guarantees goes through what is called capability-based access control. We implement this in a way we borrowed from the MinorFS least-authority file-system.
  11. Now that we have tackled the anti-forensics and integrity concerns, let's look at what MattockFS does to attenuate our spurious reads problem. First of all, MattockFS has built-in hashing functionality. Whenever data is read from or written to MattockFS, it checks whether that data can be used to advance the hash calculation of any CarvPath entity the data is part of. Secondly, MattockFS communicates with the Linux kernel regarding archive data being accessed. It informs the kernel which data it will soon want to access again and which data it will not. This should allow the kernel to be smarter about page-cache handling. In order to allow the message bus and storage system to work together effectively, MattockFS integrates the message-bus into the file-system. It provides multiple policies for picking the messages that are most beneficial to opportunistic hashing or to page-cache pressure. Finally, MattockFS provides data to the module framework that should allow the framework to properly implement throttling.
  12. A quick summary of the main features of MattockFS. MattockFS is a user-space file-system for Linux that has facilities for zero-storage carving. It is a hybrid data-archive and message-bus system geared specifically at the needs of the computer forensic process. One of its primary goals is the minimization of spurious hard disk reads that hinder overall system performance. Next to performance, MattockFS provides multiple anti-anti-forensics facilities: write-once access to data and meta-data, trusted provenance logs, and a secure capability-based form of access control for modules communicating with the file-system.
  13. There are some things that MattockFS does not implement itself but only facilitates by providing hooks. MattockFS expects the module framework on top of it to implement routing and file-type determination functionality. It gives the routing functionality a way to maintain state. The module framework is also expected to implement some form of throttling, especially for entering new data into the system and, to a lesser extent, for adding new messages to the system. MattockFS provides data about how full the page-cache is likely to get, and about how much data and how many messages are queued for the different modules. Finally, MattockFS provides hooks for creating multi-server setups. These are geared to respecting locality-of-data concerns and migrating only the most CPU-intensive jobs to other servers.
  14. To wrap things up, this is, in short, what MattockFS can provide. If you have any data-intensive problems that require a message-bus solution, MattockFS is an option you might want to explore. MattockFS is geared primarily at computer forensics, but should also be suitable for non-forensic data-intensive problems that favor a worker approach to concurrency.
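
The kernel hinting described in note 11 (the "Linux fadvise API" of slide 11) can be illustrated with Python's standard `os.posix_fadvise` call. This is a minimal sketch and not MattockFS code: the `advise_fragment` helper and its `needed_again` flag are hypothetical, and the temporary file merely stands in for an archive file.

```python
import os
import tempfile


def advise_fragment(fd: int, offset: int, size: int, needed_again: bool) -> bool:
    """Hint the kernel about one byte range of an open archive file.

    Returns False when the platform has no posix_fadvise, True otherwise.
    """
    if not hasattr(os, "posix_fadvise"):
        return False
    if needed_again:
        # Another tool in the tool-chain will read this range soon.
        advice = os.POSIX_FADV_WILLNEED
    else:
        # No queued job refers to this range any more; its pages may be dropped.
        advice = os.POSIX_FADV_DONTNEED
    os.posix_fadvise(fd, offset, size, advice)
    return True


# Stand-in for an archive file: two 4096-byte blocks.
with tempfile.NamedTemporaryFile() as archive:
    archive.write(b"\0" * 8192)
    archive.flush()
    kept = advise_fragment(archive.fileno(), 0, 4096, needed_again=True)
    dropped = advise_fragment(archive.fileno(), 4096, 4096, needed_again=False)
```

The hints are advisory only: the kernel is free to ignore them, which is why the notes say this "should" allow smarter page-cache handling rather than promising it.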