DEEPSEC 2013: Malware Datamining And Attribution

Greg Hoglund explained at BlackHat 2010 that the development environments malware authors use leave traces in the code, which can be used to attribute malware to an individual or a group of individuals. The attribution does not have the precision of a name, date of birth and address, but it provides evidence: an arrested suspect's computer can be analysed and compared with the "tool marks" on the collected malware sample.

Presentation Transcript

  • Malware Attribution Theory, Code and Result
  • Who am I?
    • Michael Boman, M.A.R.T. project
    • Have been “playing around” with malware analysis “for a while”
    • Working for FireEye
    • This is a HOBBY project that I use my SPARE TIME to work on
  • Agenda
    • Theory behind Malware Attribution
    • Code to conduct Malware Attribution analysis
    • Result of analysis
  • Theory
  • Malware Attribution: tracking cyber spies - Greg Hoglund, Blackhat 2010 http://www.youtube.com/watch?v=k4Ry1trQhDk
  • What am I trying to do? [Diagram: a spectrum between "Binary" and "Human", labeled "Move this way"]
  • What am I trying to do? [Diagram: spectrum between "Binary" and "Human"] Blacklists; Net Recon; Command and Control; Developer Fingerprints; Tactics, Techniques, Procedures; Social Cyberspace; DIGINT; Physical Surveillance; HUMINT
  • [Diagram] Physical Surveillance; HUMINT; Social Cyberspace; DIGINT; Developer Fingerprints; Tactics, Techniques, Procedures; Blacklists; Net Recon; Command and Control; Actions / Intent; Installation / Deployment; CNA (spreader) / CNE (search & exfil tool); COMS; Defensive / Anti-forensic; Exploit; Shellcode; DNS, Command and Control Protocol, Encryption
  • Steps
    • Step 0: Gather malware
    • Step 1: Extract metadata from binary
    • Step 2: Store metadata and binary in MongoDB
    • Step 3: Analyze collected data
  • Step 0: Gather malware
    • VirusShare (virusshare.com)
    • Malware Domain List (www.malwaredomainlist.com/mdl.php)
    • OpenMalware (www.offensivecomputing.net)
    • MalShare (www.malshare.com)
    • CleanMX (support.clean-mx.de/clean-mx/viruses)
  • Step 1: Extract metadata from binary
  • Development Steps [Diagram with headings Source, Machine, Binary] Core “backbone” sourcecode; Tweaks & Mods; Compiler; 3rd party sourcecode; 3rd party libraries; Time; Runtime libraries; Paths; MAC Address; Malware; Packing
  • Step 1: Extract metadata from binary
    • Hashes (for sample identification)
      • md5, sha1, sha256, sha512, ssdeep etc.
    • File type / Exif / PEiD
      • Compiler / Packer etc.
    • PE Headers / Imports / Exports etc.
    • VirusTotal results
    • Tags
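The hashing part of this step can be sketched with the Python standard library alone. This is a minimal illustration, not the project's code: the function name `file_hashes` is hypothetical, and ssdeep is left out because it needs a third-party binding.

```python
import hashlib


def file_hashes(data: bytes) -> dict:
    """Return the cryptographic digests named on the slide for one sample."""
    algorithms = ("md5", "sha1", "sha256", "sha512")
    return {name: hashlib.new(name, data).hexdigest() for name in algorithms}


if __name__ == "__main__":
    # Hypothetical sample bytes; a real run would read the file from disk.
    sample = b"MZ\x90\x00 hypothetical sample bytes"
    for name, digest in file_hashes(sample).items():
        print(name, digest)
```

Storing several digests side by side makes it easier to look a sample up in other repositories, which tend to index by different hashes.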
  • Identifying compiler / packer
    • PEiD
    • Python
      • peutils.SignatureDatabase().match_all()
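The slide names `peutils.SignatureDatabase().match_all()` from the pefile package. The sketch below is not that implementation; it is a standard-library illustration of the idea behind PEiD-style signatures, where a hex byte pattern with `??` wildcards is compared against the bytes at the entry point. The signature names and patterns are made up for the example, and the helper names are hypothetical.

```python
def parse_signature(text: str) -> list:
    """Turn 'AA ?? BB' into [0xAA, None, 0xBB]; None matches any byte."""
    return [None if tok == "??" else int(tok, 16) for tok in text.split()]


def matches_at(data: bytes, offset: int, pattern: list) -> bool:
    """Check whether the pattern matches the bytes starting at offset."""
    window = data[offset:offset + len(pattern)]
    if len(window) < len(pattern):
        return False
    return all(p is None or p == b for p, b in zip(pattern, window))


def match_all(data: bytes, entry_point_offset: int, signatures: dict) -> list:
    """Return the names of all signatures that match at the entry point."""
    return [
        name
        for name, sig in signatures.items()
        if matches_at(data, entry_point_offset, parse_signature(sig))
    ]


if __name__ == "__main__":
    # Invented signatures, not taken from a real PEiD database.
    signatures = {
        "Hypothetical Packer A": "60 E8 ?? ?? 00 00",
        "Hypothetical Compiler B": "55 8B EC",
    }
    data = bytes.fromhex("909060E812340000C3")
    print(match_all(data, 2, signatures))
```

Returning every match rather than only the first mirrors the `match_all` name on the slide: a sample can match more than one signature.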
  • PE Header information
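As a rough illustration of what "PE header information" can mean in practice, the sketch below reads three COFF header fields (machine type, section count, link timestamp) using only `struct`. It is an assumption about which fields are of interest, not the project's extractor; `parse_pe_header` and `make_demo_header` are hypothetical names, and the demo header is synthetic.

```python
import struct
from datetime import datetime, timezone


def parse_pe_header(data: bytes) -> dict:
    """Read a few COFF file-header fields from the start of a PE file."""
    if data[:2] != b"MZ":
        raise ValueError("not a PE file: missing MZ signature")
    # e_lfanew at offset 0x3C points to the "PE\0\0" signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("not a PE file: missing PE signature")
    # The COFF header follows: Machine, NumberOfSections, TimeDateStamp.
    machine, section_count, timestamp = struct.unpack_from(
        "<HHI", data, pe_offset + 4
    )
    return {
        "machine": hex(machine),
        "number_of_sections": section_count,
        "compile_time": datetime.fromtimestamp(timestamp, tz=timezone.utc),
    }


def make_demo_header(timestamp: int) -> bytes:
    """Build a synthetic, non-executable header purely for demonstration."""
    buf = bytearray(0x40 + 24)
    buf[0:2] = b"MZ"
    struct.pack_into("<I", buf, 0x3C, 0x40)
    buf[0x40:0x44] = b"PE\x00\x00"
    struct.pack_into("<HHI", buf, 0x44, 0x014C, 3, timestamp)
    return bytes(buf)


if __name__ == "__main__":
    print(parse_pe_header(make_demo_header(0)))
```

The timestamp is written by the linker and can be altered afterwards, so it is better treated as one data point among several than as proof of a build date.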
  • VirusTotal Results
  • Tags
    • User-supplied tags to identify sample source and behavior
    • analyst / analyst-system supplied
  • Step 2: Store metadata and binary in MongoDB
  • Components
    • Modified VXCage server
      • Stores malware & metadata in MongoDB instead of FS / ORDBMS
      • Collects a lot more metadata than the original
  • VXCage REST API
    • /malware/add
      • Add sample
    • /malware/get/<filehash>
      • Download sample. If no local sample, search other repos
    • /malware/find
      • Search for sample by md5, sha256, ssdeep, tag, date
    • /tags/list
      • List tags
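A client for these endpoints might look like the sketch below. It only builds the requests and does not send them. Several details are assumptions not stated on the slide: the base URL is a placeholder, search terms are assumed to be sent as POST form fields, and the function names are hypothetical. Check the server code before relying on any of this.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: placeholder host and port for a locally running server.
BASE_URL = "http://localhost:8080"

# Search keys listed on the slide for /malware/find.
SEARCH_KEYS = {"md5", "sha256", "ssdeep", "tag", "date"}


def build_find_request(key: str, value: str, base_url: str = BASE_URL) -> Request:
    """Build (but do not send) a search request for /malware/find."""
    if key not in SEARCH_KEYS:
        raise ValueError(f"unsupported search key: {key}")
    # Assumption: the server accepts the term as a form-encoded POST field.
    body = urlencode({key: value}).encode("ascii")
    return Request(f"{base_url}/malware/find", data=body, method="POST")


def build_get_request(filehash: str, base_url: str = BASE_URL) -> Request:
    """Build (but do not send) a download request for one sample."""
    return Request(f"{base_url}/malware/get/{filehash}", method="GET")


if __name__ == "__main__":
    req = build_find_request("md5", "d41d8cd98f00b204e9800998ecf8427e")
    print(req.get_method(), req.full_url, req.data)
```

Passing a built request to `urllib.request.urlopen` would perform the call against a running server.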
  • Step 3: Analyze collected data
  • Identifying development environments
    • Compiler / Linker / Libraries
    • Strings
    • Paths
    • PE Translation header
    • Compile times
    • Number of times the software has been built
  • Cataloging behaviors
    • Packers
    • Encryption
    • Anti-debugging
    • Anti-VM
    • Anti-forensics
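The "Strings" and "Paths" items can be illustrated with a small `strings`-like scan that keeps printable ASCII runs and then picks out those containing a Windows drive-letter path, such as a leftover debug-symbol path. This is a sketch under assumptions: the minimum run length of 6, the function names, and the sample bytes are all invented for the example.

```python
import re

# Runs of at least six printable ASCII characters (threshold is an assumption).
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{6,}")
# A drive letter followed by a backslash, e.g. C:\
DRIVE_PATH = re.compile(r"[A-Za-z]:\\")


def extract_strings(data: bytes) -> list:
    """Return printable ASCII runs found in the binary, in file order."""
    return [m.group().decode("ascii") for m in PRINTABLE_RUN.finditer(data)]


def find_build_paths(data: bytes) -> list:
    """Return extracted strings that contain a Windows drive-letter path."""
    return [s for s in extract_strings(data) if DRIVE_PATH.search(s)]


if __name__ == "__main__":
    # Hypothetical bytes standing in for part of a sample.
    blob = (
        b"\x00\x01C:\\Users\\dev\\project\\Release\\bot.pdb\x00"
        b"\x02short\x00kernel32.dll\x00"
    )
    print(find_build_paths(blob))
```

Paths like these can expose user names and project directory layouts from the build machine, which is the kind of "tool mark" the talk is about.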
  • Result
  • Have I seen you before?
    • Detects similar malware (based on SSDEEP fuzzy hashing)
  • Different MD5, 100% SSDeep match
  • SSDEEP Analysis (3007)
  • SSDEEP Analysis (851)
  • Challenges
    • Party handshake problem:
      • 707k samples analyzed and counting (resulting in over 250 billion comparisons!)
      • Need a better target (pre-)selection
  • What compilers / packers are common?
    1. "Borland Delphi 3.0 (???)", 54298
    2. "Microsoft Visual C++ v6.0", 33364
    3. "Microsoft Visual C++ 8", 28005
    4. "Microsoft Visual Basic v5.0 - v6.0", 26573
    5. "UPX v0.80 - v0.84", 22353
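The handshake arithmetic is n(n-1)/2 pairwise comparisons, which for 707,000 samples comes to roughly 250 billion. The sketch below computes that count and shows one possible pre-selection, which is a suggestion and not something the talk describes: ssdeep hashes start with a block size, and two hashes are only comparable when their block sizes are equal or differ by a factor of two, so other pairs can be skipped. The function names and the example hash strings are hypothetical.

```python
from collections import defaultdict


def pair_count(n: int) -> int:
    """Number of unordered pairs among n samples (the handshake problem)."""
    return n * (n - 1) // 2


def block_size(ssdeep_hash: str) -> int:
    """An ssdeep hash has the form 'blocksize:chunk1:chunk2'."""
    return int(ssdeep_hash.split(":", 1)[0])


def candidate_pairs(hashes: list):
    """Yield pairs whose block sizes are equal or differ by a factor of two."""
    by_block = defaultdict(list)
    for h in hashes:
        by_block[block_size(h)].append(h)
    for size, group in by_block.items():
        for i, a in enumerate(group):
            # Same block size: pair with every later hash in the group.
            for b in group[i + 1:]:
                yield a, b
            # Double block size: .get() avoids creating empty buckets.
            for b in by_block.get(size * 2, []):
                yield a, b


if __name__ == "__main__":
    print(pair_count(707_000))
    # Made-up hash strings, only the block-size prefix matters here.
    demo = ["3:abc:def", "3:abd:deg", "6:xyz:uvw", "96:aaa:bbb"]
    print(len(list(candidate_pairs(demo))), "of", pair_count(len(demo)))
```

How much this saves depends on how the block sizes are distributed across the collection, so it would need measuring on real data.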
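A ranking like this can be produced by counting signature matches across the stored metadata. The sketch below uses `collections.Counter` over in-memory records; the record layout and the `peid` field name are assumptions for illustration, and in the real system the data would come from MongoDB.

```python
from collections import Counter


def top_signatures(records: list, n: int = 5) -> list:
    """Count compiler/packer signature matches and return the n most common."""
    counts = Counter(
        sig
        for rec in records
        for sig in rec.get("peid", [])  # "peid" is a hypothetical field name
    )
    return counts.most_common(n)


if __name__ == "__main__":
    # Hypothetical metadata records, one per sample.
    records = [
        {"peid": ["UPX v0.80 - v0.84"]},
        {"peid": ["Microsoft Visual C++ v6.0"]},
        {"peid": ["UPX v0.80 - v0.84"]},
        {},  # sample with no signature match
    ]
    for name, count in top_signatures(records):
        print(name, count)
```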
  • Are there any unidentified packers?
    • How to identify a packer
      • PE Section is empty in binary, is writable and executable
  • How common are antidebugging techniques?
    • 31622 out of 531182 PE binaries use IsDebuggerPresent (6 %)
    • Packed executables not counted
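The heuristic on this slide can be written as a check on section metadata: a section with no raw data on disk that is both writable and executable is a likely unpacking target. The sketch below takes "empty in binary" to mean a raw size of zero, which is an assumption. The `Section` class and the example values are hypothetical; the two flag constants are the standard PE section characteristics.

```python
from dataclasses import dataclass

# Standard PE section characteristic flags.
IMAGE_SCN_MEM_EXECUTE = 0x20000000
IMAGE_SCN_MEM_WRITE = 0x80000000


@dataclass
class Section:
    """Hypothetical minimal view of one PE section header."""
    name: str
    size_of_raw_data: int
    characteristics: int


def looks_like_unpack_target(section: Section) -> bool:
    """Empty on disk, yet writable and executable at run time."""
    writable = bool(section.characteristics & IMAGE_SCN_MEM_WRITE)
    executable = bool(section.characteristics & IMAGE_SCN_MEM_EXECUTE)
    return section.size_of_raw_data == 0 and writable and executable


def suspicious_sections(sections: list) -> list:
    """Return the names of sections that match the heuristic."""
    return [s.name for s in sections if looks_like_unpack_target(s)]


if __name__ == "__main__":
    # Invented section table for illustration.
    sections = [
        Section("UPX0", 0, 0xE0000080),
        Section(".text", 4096, 0x60000020),
        Section(".bss", 0, 0xC0000080),
    ]
    print(suspicious_sections(sections))
```

A sample that trips this check without matching any known packer signature is a candidate for an unidentified packer.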
  • Analysis Coverage [Diagram with headings Source, Machine, Binary] Core “backbone” sourcecode; Tweaks & Mods; Compiler; 3rd party sourcecode; 3rd party libraries; Time; Runtime libraries; Paths; MAC Address; Malware; Packing
  • Future
  • What am I trying to do in the future [Diagram: spectrum between "Binary" and "Human"] Blacklists; Net Recon; Command and Control; Developer Fingerprints; Tactics, Techniques, Procedures; Social Cyberspace; DIGINT; Physical Surveillance; HUMINT
    • Expand scope of analysis: +network, +memory, +OS changes, +behavior
  • What am I trying to do in the future
    • More automation
    • More modular design
    • Solve the “Big Data” issue I am getting myself into (Hadoop?)
    • More pretty graphs
  • Thank you
    • Michael Boman
    • michael@michaelboman.org
    • @mboman
    • http://blog.michaelboman.org
    • Code available at https://github.com/mboman/vxcage