Malware Collection and Analysis via Hardware Virtualization

  1. Malware Collection and Analysis via Hardware Virtualization
     Tamas K Lengyel
     Computer Science and Engineering
     11/10/2015
  2. Outline
     1. Introduction and Problem statement
     2. Background, Challenges & Approach
     3. Limitations and scope
     4. Publications to date
     5. Malware collection system & results
     6. Malware analysis system & results
     7. Hardware and software limitations
     8. Contributions
     9. Future work
  3. Introduction
     • 1,000,000 new malware binaries a day
     • Thwarting malware requires in-depth understanding of its operation
     • Collect and analyze malware
     • Existing tools and techniques are impeded by modern malware techniques
       • Packing, evasion and metamorphism
     • Hardware virtualization has been proposed to counter these techniques
  4. Requirements
     1. Scalability: Maximizing the number of concurrently active collection and analysis sessions on limited hardware resources
     2. Stealth: Detection of the monitoring environment should be prevented
     3. Fidelity: The collected data has to be accurate
     4. Isolation: Monitoring components have to be securely isolated and we need to prevent cross-contamination
  5. Prominent prior work
     • 2005: Vrable et al. - Scalability, fidelity, and containment in the potemkin virtual honeyfarm
     • 2008: Payne et al. - Lares: An architecture for secure active monitoring using virtualization
     • 2008: Dinaburg et al. - Ether: malware analysis via hardware virtualization extensions
     • 2013: Deng et al. - Spider: Stealthy binary program instrumentation and debugging via hardware virtualization
  6. Problem statement
     Developing effective anti-malware technologies requires the collection and rapid analysis of an increasing number of malware samples such that all four requirements are met simultaneously.
     No comprehensive evaluation to date has been performed to determine whether virtualization is an effective platform for the development of such tools.
  7. Virtualization
  8. Challenges
     1. Scalability: Disk and memory requirements are linear
     2. Stealth: In-guest tools can be detected
     3. Isolation: In-guest tools can be disabled; cross-contamination of VMs over the network
     4. Fidelity: Data collection is negatively impacted by 2 & 3
  9. Our approach
     1. Study current malware techniques
     2. Develop out-of-guest tools
     3. Conduct live experiments
     4. Evaluate results
     5. Study shortcomings and limitations
  10. Limitations
     1. Definition of malware
        • Constantly evolving and undefined set
     2. Measurements and metrics
        • Requirements are not always quantifiable
        • Results are only indicative, not definitive
        • We work to counter current malware techniques
     3. Repeatability of experiments
        • External entities outside our control
  11. Scope
     • Malware analysis vs. malware detection
       • Black Box Analysis
       • We only aim at collecting relevant information which may aid malware detection
     • Detection of virtualization vs. detection of monitoring
       • Virtualization is already widely deployed
     • Determining when we collected enough data
       • Halting problem
  12. Publications
     • CSET’12: Virtual Machine Introspection in a Hybrid Honeypot Architecture. Acceptance rate: 48%
     • NSS’13: Towards Hybrid Honeynets via Virtual Machine Introspection and Cloning. Acceptance rate: 24%
     • SHCIS’14: Multi-tiered Security Architecture for ARM via the Virtualization and Security Extensions
     • MMF’14: Pitfalls of Virtual Machine Introspection on Modern Hardware
     • MMF’14: Code Validation for Modern OS Kernels
     • ACSAC’14: Scalability, Fidelity and Stealth in the DRAKVUF Dynamic Malware Analysis system. Acceptance rate: 19.9%
     • SHCIS’15: Virtual Machine Introspection with Xen on ARM
     • C&TC’15: CloudIDEA: A Malware Defense Architecture for Cloud Data Centers. Acceptance rate: 38%
  13. Malware collection
     Primary requirement: capture malware binaries
     • Scalability: Deploy copy-on-write disk and memory sharing
     • Stealth: No in-guest agents, no modification to the hypervisor
     • Isolation: External agent + network isolation
     • Fidelity: Kernel heap pool-tag scanning
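The copy-on-write sharing named under Scalability can be illustrated with a minimal sketch. It is not the hypervisor's actual mechanism: `struct page`, `page_share` and `page_write` are hypothetical names, and a reference count stands in for the real memory-sharing bookkeeping. The idea shown is that clones share the origin's page until one of them writes to it, so memory use grows with the number of modified pages rather than with the number of clones.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_SIZE 4096u

/* Hypothetical page descriptor: refs counts how many VMs map this page. */
struct page {
    uint8_t  data[PAGE_SIZE];
    unsigned refs;
};

/* A clone starts out by sharing the origin's page instead of copying it. */
struct page *page_share(struct page *p)
{
    p->refs++;
    return p;
}

/* Write one byte through a VM's page slot. If the page is still shared,
 * give this VM a private copy first, so other VMs never see the write.
 * Returns 0 on success, -1 on a bad offset or allocation failure. */
int page_write(struct page **slot, size_t off, uint8_t value)
{
    struct page *p = *slot;

    if (off >= PAGE_SIZE)
        return -1;

    if (p->refs > 1) {
        struct page *copy = malloc(sizeof *copy);
        if (copy == NULL)
            return -1;
        memcpy(copy->data, p->data, PAGE_SIZE);
        copy->refs = 1;
        p->refs--;      /* this VM no longer maps the shared page */
        *slot = copy;
        p = copy;
    }

    p->data[off] = value;
    return 0;
}
```

In this sketch, a write through a clone's slot replaces the slot with a private copy and leaves the origin's page untouched, which is also what keeps one analysis session from contaminating another.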
  14. Network Isolation
  15. Fidelity via pool tag scanning
     struct {
         union {
             struct {
                 uint16_t previous_size:9;
                 uint16_t pool_index   :7;
                 uint16_t block_size   :9;
                 uint16_t pool_type    :7;
             };
             uint16_t flags;
         };
         uint32_t pool_tag;
     } _POOL_HEADER;
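A minimal sketch of how a scanner could use this header, working on a plain byte buffer rather than live guest memory. It assumes the 8-byte header layout shown on the slide (tag at offset 4) and that headers sit on 8-byte boundaries; the alignment, the function names and the tag used in the usage note are illustrative assumptions, not taken from the deck.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define POOL_ALIGN       8u  /* assumed allocation granularity */
#define POOL_TAG_OFFSET  4u  /* pool_tag follows the 4-byte union */
#define POOL_HEADER_SIZE 8u

/* Pack a four-character tag the way it appears in memory. */
uint32_t make_tag(const char t[4])
{
    uint32_t v;
    memcpy(&v, t, sizeof v);
    return v;
}

/* Walk the buffer in POOL_ALIGN steps and compare the 32-bit value at the
 * tag offset against `tag`. Stores up to max_hits header offsets in hits[]
 * and returns the total number of matches found. */
size_t scan_pool_tags(const uint8_t *mem, size_t len, uint32_t tag,
                      size_t *hits, size_t max_hits)
{
    size_t found = 0;

    for (size_t off = 0; off + POOL_HEADER_SIZE <= len; off += POOL_ALIGN) {
        uint32_t candidate;
        memcpy(&candidate, mem + off + POOL_TAG_OFFSET, sizeof candidate);
        if (candidate == tag) {
            if (found < max_hits)
                hits[found] = off;
            found++;
        }
    }
    return found;
}
```

A call such as `scan_pool_tags(buf, len, make_tag("Test"), hits, 16)` returns the number of candidate headers, where `"Test"` is a placeholder tag. A real scanner would go on to validate the remaining header fields (block size, pool type) before trusting a hit, since a tag match alone can be coincidental.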
  16. Captured malware samples
  17. Results: scalability
  18. Malware analysis
     Primary requirement: capture useful live data
     • Scalability: Re-use CoW techniques from prior experiments
     • Stealth: No in-guest agents, no modification to the hypervisor, command injection with VMI
     • Isolation: VLAN tagging, TCB disaggregation
     • Fidelity: Syscalls and kernel heap-allocations
  19. Useful data?
     The goal is to generate data that is complete enough to be useful for analysis.
     Data collection should be flexible so that it can be tuned to specific requirements.
     Two main objectives defined in prior art:
     1. Syscall monitoring
     2. Kernel heap monitoring
     We also monitor deleted files, which we deemed an interesting and useful addition.
  20. System design
  21. Syscall trapping
     Stealthy breakpoint injection method:
     1. Overwrite internal kernel function entry points with #BP (0xCC)
     2. Read/write protect the page with EPT
     3. When a trap hits, restore the original byte
     4. Single-step one instruction
     5. Place the breakpoint back again
     Can monitor all internal kernel functions, not just system calls!
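The five steps above can be sketched as a small simulation over a byte array standing in for guest memory. This is not DRAKVUF's code and it calls no hypervisor API: the EPT protection of step 2 is only modeled by `guest_read`, which returns what a guest should observe when it reads a hooked address, and the single-step of step 4 is left to the caller. All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BP_OPCODE 0xCC  /* x86 INT3 */

struct trap {
    size_t  addr;      /* offset of the hooked function entry */
    uint8_t original;  /* byte that the breakpoint replaced */
    bool    armed;     /* true while 0xCC is present in memory */
};

/* Step 1: save the original byte and overwrite the entry with 0xCC. */
void trap_inject(uint8_t *guest, struct trap *t, size_t addr)
{
    t->addr = addr;
    t->original = guest[addr];
    guest[addr] = BP_OPCODE;
    t->armed = true;
}

/* Step 2 (modeled): a guest read of the hooked byte must see the original
 * value, so the breakpoint stays hidden from integrity checks. */
uint8_t guest_read(const uint8_t *guest, const struct trap *t, size_t addr)
{
    if (t->armed && addr == t->addr)
        return t->original;
    return guest[addr];
}

/* Step 3: the breakpoint fired, put the original byte back so the
 * instruction can execute. Step 4 (single-step) happens after this. */
void trap_on_hit(uint8_t *guest, struct trap *t)
{
    guest[t->addr] = t->original;
    t->armed = false;
}

/* Step 5: after the single-step completes, re-arm the breakpoint. */
void trap_rearm(uint8_t *guest, struct trap *t)
{
    guest[t->addr] = BP_OPCODE;
    t->armed = true;
}
```

The window between `trap_on_hit` and `trap_rearm` is where the breakpoint is absent from memory, which is why the number of instructions stepped is kept to one.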
  22. Heap-allocation trapping
  23. Command injection
  24. Syscalls of 115k malware
  25. Heap allocs of 115k malware
  26. Files deleted
     File size 100KByte+
  27. Stalling malware
     Standard methods
     • Detection of virtualized environments
     • Detection of in-guest artifacts
     • Sleeping
     Advanced methods
     • Time-skew detection
     • API spamming
  28. API spamming
     • Repeatedly call monitored APIs which normally complete fast
       • NtCreateSemaphore
     • Logging these calls will take more time
     • Spamming these calls makes the monitoring session time out
     Use of NtCreateSemaphore in 60s:
     • Observed in: 45,383 samples
     • Average: 7.77
     • Samples significantly above average: 1
     • Number of calls: 17,453
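One way to surface such a sample from per-sample call counts is to compare each count against a multiple of the mean. The slide does not say how "significantly above average" was determined, so the multiplier below is purely an assumption, as are the function names.

```c
#include <assert.h>
#include <stddef.h>

/* Mean number of calls per sample; 0.0 for an empty set. */
double mean_calls(const unsigned long *counts, size_t n)
{
    double sum = 0.0;

    if (n == 0)
        return 0.0;
    for (size_t i = 0; i < n; i++)
        sum += (double)counts[i];
    return sum / (double)n;
}

/* Flag samples whose call count exceeds factor * mean. Stores up to
 * max_flagged sample indices in flagged[] and returns the total number
 * of samples over the threshold. `factor` is an assumed tuning knob. */
size_t flag_spammers(const unsigned long *counts, size_t n, double factor,
                     size_t *flagged, size_t max_flagged)
{
    double threshold = mean_calls(counts, n) * factor;
    size_t found = 0;

    for (size_t i = 0; i < n; i++) {
        if ((double)counts[i] > threshold) {
            if (found < max_flagged)
                flagged[found] = i;
            found++;
        }
    }
    return found;
}
```

With only a handful of samples a single spammer inflates the mean considerably, so a median-based threshold would likely be more robust. Across tens of thousands of samples, as on the slide, the effect of one outlier on the mean is small.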
  29. Summary
     • Hardware virtualization is effective for both malware collection and analysis
     • All four requirements can be met simultaneously using hardware virtualization
     • The technology is sufficiently flexible to develop and fine-tune data collection techniques
     • Major improvement in the arms-race against malware
  30. Software limitations
     • Race-condition with multiple vCPUs
  31. Hardware limitations on x86
     • EPT only reports violation start address
     • Read/write operation may be up to 8 bytes long
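A consequence of these two points is that a monitor cannot decide whether a watched byte was touched by comparing it with the reported address alone. A conservative check, sketched here under the assumption that an access covers at most 8 bytes starting at the reported address, treats any watched address inside that window as possibly accessed. The function name and constant are illustrative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_ACCESS_LEN 8u  /* upper bound on access length, per the slide */

/* Returns true if an access starting at violation_gpa could have touched
 * watched_gpa, given that only the start address is reported and the
 * access may span up to MAX_ACCESS_LEN bytes. */
bool access_may_touch(uint64_t violation_gpa, uint64_t watched_gpa)
{
    return watched_gpa >= violation_gpa &&
           watched_gpa - violation_gpa < MAX_ACCESS_LEN;
}
```

Note that an access reported at 0x0FFC may reach a byte at 0x1000, so a violation reported on one page can involve a watched byte at the start of the next page.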
  32. Hardware limitations on x86
     • sTLB makes TLB-splitting attacks no longer feasible
     • TLB can still be used to hide mappings from VMI
  33. Hardware limitations on ARM
     • Split-TLB architecture without sTLB
     • Hardware-assisted translation available from the VMM
     • Translation is performed as data-fetch access
       • Only hits the dTLB
     • Hiding code-pages on ARM is possible via split-TLB attacks
  34. Contributions
     1. Identified core requirements that must be met simultaneously
     2. Developed and open-sourced the prototypes, with major contributions to existing systems
     3. Performed extensive tests with modern malware
     4. Identified hardware and software limitations that must be addressed when building such systems
  35. Future work
     • Keeping up with the evolving threat landscape
       • Attacks against the hypervisor and lower layers
       • Data-only malware
       • Stalling malware
     • Making use of new and evolving hardware virtualization extensions
       • Hybrid VMI
     • Data-mining the collected information
       • Identifying malware groups
       • Creating IDS/IPS rules
  36. Questions?
     • Dissertation text available at
     • DRAKVUF
     • LibVMI