Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

PeekaTorrent: Leveraging P2P Hash Values for Digital Forensics

646 views

Published on

Presentation from DFRWS'16 USA

Published in: Internet
  • Be the first to comment

  • Be the first to like this

PeekaTorrent: Leveraging P2P Hash Values for Digital Forensics

  1. 1. PeekaTorrent Leveraging P2P Hash Values for Digital Forensics Sebastian Neuner, Martin Schmiedecker, Edgar Weippl
  2. 2. Problem Description We are drowning in data: • processes and best-practices do not scale well • 12TB hard drives recently presented • sector hashing, unziping and unpacking, ... 2/21
  3. 3. Problem Description 3/21
  4. 4. Problem Description We’d like to ignore: 4/21
  5. 5. Problem Description Hashing is prevalent: • DHTs, P2P file-sharing (SHA-1) • Dropbox (4MB, SHA-256) • file whitelisting (NSRL): ◦ full file (SHA256, SHA1 & MD5) ◦ fuzzy files (ssdeep, sdhash) ◦ blocks (MD5b4096, MD5b8192) 5/21
  6. 6. Problem Description Hashing is prevalent: • DHTs, P2P file-sharing (SHA-1) • Dropbox (4MB, SHA-256) • file whitelisting (NSRL): ◦ full file (SHA256, SHA1 & MD5) ◦ fuzzy files (ssdeep, sdhash) ◦ blocks (MD5b4096, MD5b8192) 6/21
  7. 7. Problem Description Couldn’t we add to this: • exclude commonly found files • mostly totally irrelevant for investigation • even before looking at files manually 7/21
  8. 8. peekaTorrent
  9. 9. peekaTorrent 8/21
  10. 10. peekaTorrent General idea: • leverage publicly shared hash values • more granular than files, but less than sectors • it’s all in the .torrent • copyright-free! 9/21
  11. 11. peekaTorrent BitTorrent uses chunking: • all files are concatenated • then split in chunks (=pieces) • most often 256kb, (observed 16kb-16mb) • depending on implementation and user preference 10/21
  12. 12. peekaTorrent Instead of hashing sectors, or files: • variable hash windows (2n ) • iterate over each sector • build on bulk extractor Then pipe it all into hashdb, see what drops out 11/21
  13. 13. peekaTorrent Benefits: • also deleted & partially overwritten files • fast! • less false-positives • hashdb files can be easily shared 12/21
  14. 14. peekaTorrent Use cases: • file whitelisting (torrents) • file blacklisting • custom hashsets: source code, email attachements, sharepoint, ... 13/21
  15. 15. peekaTorrent Simplistic use: • create torrent with files of interest • don’t publish/announce it • pipe into hashdb, done 14/21
  16. 16. Evaluation 15/21
  17. 17. Evaluation Collected data: • in total: 2.65 million torrent files • crawling Piratebay & KAT • multiple data dumps • 3.3 billion unique chunk hashes • up to 2.6 PB of data 16/21
  18. 18. Evaluation Some numbers: • 1 GB filesystem, Ubuntu Desktop = 1158 chunks • running bulk extractor: 220s (Notebook), 23s (Server) • running hashdb: few seconds 17/21
  19. 19. Limitations Non-usable data: • chunks consisting of two files • fragementation on disk 18/21
  20. 20. Future Work What’s needed: • more .torrents! • more data • investigate data set more closely (duplicates) • get feedback 19/21
  21. 21. Sharing is Caring 20/21
  22. 22. Questions? Thank you for your attention URL: https://peekatorrent.org Email: mschmiedecker@sba-research.org Twitter: @fr333k 21/21

×