Abstract
Writing digital forensics (DF) tools is difficult because of the diversity of
Introduction
As the field of digital forensics (DF) continues to grow
Few of today’s forensic tool developers have formal t
Meaning of digital forensics software
memory dumps, network packet captures, program executable
The use of DF tools
1- Criminal investigations
2- Internal investigations
3- Audits
All of which have different standards for chain-of-custody, admissibility, and scientific validity.
Hackers hide data in several ways, including steganography techniques, but can be caught by artifacts, copy forge techniqu bad sectors or using an alternate data stream (ADS) like C:notepad file.txt:hide (hee files securely for good you need to use the Gutmann algorithm for writing 35 times ra
Distinct Sector Hashes for Target File Detection
Hashing files to check for file changes
Hashing sectors to discover changes in file segments
The hashing algorithm depends on probability, so it won't hash the whole drive bec
Looking for distinct hashes and repeated file patterns using government data
An algorithm using the urn statistics problem to find sectors that need to be inspected
Finding distinct and repeated hashes in hard disk sectors
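The distinction between distinct and repeated sector hashes can be sketched in a few lines of Python. This is only an illustration, not the tool described in the slides: the 512-byte sector size, the use of SHA-256, and the names `sector_hashes` and `split_distinct_repeated` are all assumptions made for the example.

```python
import hashlib
from collections import Counter

SECTOR_SIZE = 512  # assumed sector size; many modern drives use 4096 bytes


def sector_hashes(data, sector_size=SECTOR_SIZE):
    """Hash each sector-sized block of data, zero-padding a short final block."""
    hashes = []
    for offset in range(0, len(data), sector_size):
        block = data[offset:offset + sector_size].ljust(sector_size, b"\x00")
        hashes.append(hashlib.sha256(block).hexdigest())
    return hashes


def split_distinct_repeated(hashes):
    """Separate hashes seen exactly once from hashes seen more than once."""
    counts = Counter(hashes)
    distinct = {h for h, n in counts.items() if n == 1}
    repeated = {h for h, n in counts.items() if n > 1}
    return distinct, repeated


# Toy "disk image": sectors 0 and 2 are identical, sectors 1 and 3 are unique.
image = b"A" * 512 + b"B" * 512 + b"A" * 512 + b"C" * 100
hashes = sector_hashes(image)
distinct, repeated = split_distinct_repeated(hashes)
print(len(hashes), "sectors,", len(distinct), "distinct,", len(repeated), "repeated")
```

A distinct hash is a useful fingerprint for a target file because it is unlikely to appear on a drive by coincidence, whereas a repeated hash (for example an all-zero sector) says little about which file it came from.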
Using different data structures and testing the speed for the file system
Network forensics
Network forensics challenges:
Cloud computing challenges needed new tools
New frontiers in network intrusion starting from the firewall
Emerging network forensic areas:
Social networks
Data mining
Digital imaging and data visualization
Applying network forensics in critical infrastructures
Botnets
Wireless networks still lack good forensic tools
Sinkholes: accept, analyze, and forensically store attack traffic
SCADA (Supervisory Control and Data Acquisition) challenges
Install forensic tools at layers 0-2
Smartphone security challenges
Smartphone threat model showing malware spreading from the application layer to th
Lessons in digital forensics
The challenge of data diversity
1- Processing incomplete or corrupt data.
2- Why data will not validate.
3- Windows inconsistencies.
4- Eliminating data that are consistent.
Data scale challenges
1- The amount of data.
2- Applying big data solutions to DF.
Sub-linear algorithms for reading secto
hms that operate by sampling data. Sampling is a powerful technique and can frequently fi
he absence of data: the only way to establish that there are no written sectors on a hard d
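Why sampling can find data but cannot prove its absence can be illustrated with the classic urn calculation. The sketch below is an assumption-laden example rather than the algorithm from the slides: it supposes sectors are sampled uniformly at random without replacement, and the function name `miss_probability` and its parameters are hypothetical.

```python
def miss_probability(total_sectors, target_sectors, sampled):
    """Probability that a random sample of sectors (without replacement)
    contains none of the target sectors.

    Urn model: the drive is an urn of total_sectors balls, target_sectors
    of which are marked; we draw `sampled` balls and ask how likely it is
    that no marked ball is drawn.
    """
    if sampled > total_sectors - target_sectors:
        return 0.0  # more draws than unmarked sectors: a hit is certain
    p = 1.0
    for i in range(sampled):
        p *= (total_sectors - target_sectors - i) / (total_sectors - i)
    return p


# Illustrative numbers only: a drive of 1,000,000 sectors holding
# 1,000 sectors of interest, with 5,000 sectors sampled.
print(miss_probability(1_000_000, 1_000, 5_000))
```

Under this model the miss probability falls quickly as the sample grows, but it only reaches zero when the sample is large enough to leave no room for the target sectors, which for a single written sector means reading the whole drive.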
Temporal diversity: the never-ending upgrade cycle
Many computer users have learned that upgrades are
1- Upgrading forensics tools
2- Software versions to be upgraded
3- EnCase forensics tool
4- Intelligent forensics tools
Human capital demands and limitations
1- It was found that users of DF software come overwhelming
2- Examiners that have substantial knowledge in one area (e.g
3- Developers also with skills like opcodes, multi-threading, organization of processes, and operating system data structures
The CSI Effect
Hard to recover data in reality
Hard to recover data from hard disk
Recovering data from hard drives typically involves decoding
Funding problems
The differences between Windows Explorer and EnCase Fore
Lessons learned managing a research corpus
This project started in 1998 and has expanded to incl
downloaded from US Government web servers, disk i
Corpus management–technical issues
1- Imaging ATA drives
Lesson: read the documentation for the computer that you are using.
Lesson: make the most of the tools that you have and follow technical innovation (because you are dealing with hard disks with different technologies whether
2- Automation as the key to corpus management
Needed a process for capturing the hard disk make, model, serial numb
Lesson: automation is key; any process that involves manual record ke
Lesson: useful data will outlive the system in which it is stored, so mak
3- Evidence file formats (custom container file)
Trying to use his own container files did not work well and he had to use standard co
Lesson: avoid developing new file formats has never been possible.
Lesson: kill your darlings.
4- Crashes from bad drives
Causes of crashes are many: it could be overwritten kernel memory, a faulty drive, or
Lesson: many technical options remain unexplored.
5- Drive failures produce better data
Algorithm 1: developed an algorithm that reads from
Algorithm 2: developed a disk imaging program called
Lessons learned
Lesson: drives with some bad sectors invariably have more sensitive in
Lesson: do research, and only to maintain software that implements a p
6- Numbering and naming
Algorithm 1: developed an algorithm that was generating files
Lesson: names must be short enough to be usable but long e
When I started acquiring data outside the US I discovered that the country of origin wa
batch number allows different individuals in the same country to assign their own n
Lesson: although it is advantageous to have names that contain no semantic content, it is significantly easier to work with names that have some semantic meaning.
7- Path names
Lesson: place access-control information as near to the root of a path name as possible.
8- Anti-virus and indexing
Lesson: configure anti-virus scanners and other indexing tools to ignore directo
9- Distribution and updates
Lesson: solutions developed by other disciplines for distributing large files rarely wor
Corpus management–policy issues
1- Privacy issues
Lesson: just because something is legal, you may wish to think twice before you do it.
2- Illegal content: financial, passwords, and copyright
Lesson: never sell access to DF data, even if you have personal ownership.
Lesson: understand copyright law before copying other people’s data.
Lesson: make sure your intent is scientific research, not fraud, so that any collection of access
3- Illegal content: pornography
Lesson: do not give minors access to real DF data; do not intentionally extract pornography fro
4- Institutional Review Boards
Lesson: while IRBs exist to protect human subjects, many have expanded their role to protect institutions and experimenters.
Unfortunately this expanded role occasionally decreases the protection afforded human subje
the IRB watching over you, it’s important to watch your back.
Lessons learned developing DF tools
1- Platform and language
2- Parallelism and high performance computing
3- All-in-one tools vs. single-use tools
4- Evidence container file formats
1- Platform and language
1- The easiest way to write multi-platform tools is to write command-li
2- Although C has historically been the DF developer’s language of choice
3- Java has a reputation for being slow, especially for high computationa
4- While it is easy to write programs in Python, experience to date has s
2- Parallelism and high performance computing
ications bottlenecks and a lot of times host computer processor is better th
3- All-in-one tools vs. single-use tools
My experience argues that it is better to have a single tool than many:
If there are many tools, most investigators will want to have them all. Splitting functi
Much of what a DF tool does -- data ingest, decoding and enumerating data structu
There is a finite cost to packaging, distributing, and promoting a tool. When a tool ha
4- Evidence container file formats
should be allowed to process inputs in any format and transparently handle disk images in
2- With network packets the situation is better, with pcap being the universal format.
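Part of what makes pcap easy to support is its small, fixed file header. As a rough sketch, the 24-byte global header of a classic pcap file can be decoded with the standard `struct` module. The function name `parse_pcap_global_header` and the returned dictionary keys are assumptions for this example, the sample header is built in memory rather than read from a real capture, and the newer pcapng format is not handled.

```python
import struct

PCAP_MAGIC_USEC = 0xA1B2C3D4  # classic pcap, microsecond timestamps
PCAP_MAGIC_NSEC = 0xA1B23C4D  # classic pcap, nanosecond timestamps


def parse_pcap_global_header(header):
    """Decode the 24-byte global header of a classic pcap file."""
    if len(header) < 24:
        raise ValueError("pcap global header is 24 bytes")
    # The magic number reveals the byte order the file was written in.
    for endian, name in (("<", "little"), (">", "big")):
        (magic,) = struct.unpack(endian + "I", header[:4])
        if magic in (PCAP_MAGIC_USEC, PCAP_MAGIC_NSEC):
            break
    else:
        raise ValueError("not a classic pcap file")
    major, minor, _thiszone, _sigfigs, snaplen, linktype = struct.unpack(
        endian + "HHiIII", header[4:24]
    )
    return {
        "byte_order": name,
        "nanosecond": magic == PCAP_MAGIC_NSEC,
        "version": (major, minor),
        "snaplen": snaplen,
        "linktype": linktype,  # 1 is Ethernet
    }


# Synthetic little-endian header: version 2.4, snaplen 65535, Ethernet.
sample = struct.pack("<IHHiIII", PCAP_MAGIC_USEC, 2, 4, 0, 0, 65535, 1)
info = parse_pcap_global_header(sample)
print(info)
```

For a real capture, the same function could be applied to the first 24 bytes read from the file in binary mode.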
Famous digital forensics tools
EnCase
FTK
PTK Forensics
Nuix
Intella
Microsoft COFEE
Conclusion
1- Digital forensics is an exciting area in which to work, but it is exceedingly difficult b
2- These problems are likely to get worse over time, and our only way to survive the c
3- In building and maintaining this corpus he encountered many problems that are in