Successfully reported this slideshow.
Your SlideShare is downloading. ×

Ninja Correlation of APT Binaries

Ad
Ad
Ad
Ad
Ad
Ad
Ad
Ad
Ad
Ad
Ad
Loading in …3
×

Check these out next

1 of 35 Ad

Ninja Correlation of APT Binaries

Download to read offline

Knowledge and identification of Malware binaries is a crucial part of detection and incident response. There was a time when using MD5s was sufficient to ID binaries. The reverse engineering analysis conducted once would be useful anytime that same MD5 hash was seen again. This has rapidly changed in recent years. Polymorphic samples of the same specimen change the file hash (MD5, SHAx etc) without much effort by the attacker. Also, cyber criminals and advanced adversaries reuse their codebase to create newer versions of their malware, but changes in the file hash disallow any opportunity to connect and leverage previous analyses of similar samples by defenders. This gives them an asymmetric advantage.

In recent years, there has been research into “similarity metrics”― methods that can identify whether, or to what degree, two malware binaries are similar to each other. Imphash, ssdeep and sdhash are examples of such techniques. In this talk, Bhavna will review which of these techniques is more suitable for evaluating similarities in code for APT related samples. This presentation will take a data analytics approach. We will look at binary samples from APT events from Jan- Mar 2015 and create clusters of similar binaries based on each of the three similarity metrics under consideration. We will then evaluate the accuracy of the clusters and examine their implications on the effectiveness of each technique in identifying provenance of an APT related binary. This can aid Incident responders in connecting otherwise disparate infections in their environment to a single threat group and apply past analyses of the abilities and motivations of that adversary to conduct more effective response.

Knowledge and identification of Malware binaries is a crucial part of detection and incident response. There was a time when using MD5s was sufficient to ID binaries. The reverse engineering analysis conducted once would be useful anytime that same MD5 hash was seen again. This has rapidly changed in recent years. Polymorphic samples of the same specimen change the file hash (MD5, SHAx etc) without much effort by the attacker. Also, cyber criminals and advanced adversaries reuse their codebase to create newer versions of their malware, but changes in the file hash disallow any opportunity to connect and leverage previous analyses of similar samples by defenders. This gives them an asymmetric advantage.

In recent years, there has been research into “similarity metrics”― methods that can identify whether, or to what degree, two malware binaries are similar to each other. Imphash, ssdeep and sdhash are examples of such techniques. In this talk, Bhavna will review which of these techniques is more suitable for evaluating similarities in code for APT related samples. This presentation will take a data analytics approach. We will look at binary samples from APT events from Jan- Mar 2015 and create clusters of similar binaries based on each of the three similarity metrics under consideration. We will then evaluate the accuracy of the clusters and examine their implications on the effectiveness of each technique in identifying provenance of an APT related binary. This can aid Incident responders in connecting otherwise disparate infections in their environment to a single threat group and apply past analyses of the abilities and motivations of that adversary to conduct more effective response.

Advertisement
Advertisement

More Related Content

Slideshows for you (20)

Similar to Ninja Correlation of APT Binaries (20)

Advertisement

More from CODE BLUE (20)

Recently uploaded (20)

Advertisement

Ninja Correlation of APT Binaries

  1. 1. N I N J A C O R R E L AT I O N O F A P T B I N A R I E S E VA L U A T I N G T H E E F F E C T I V E N E S S O F F U Z Z Y H A S H I N G T E C H N I Q U E S I N I D E N T I F Y I N G P R O V E N A N C E O F A P T B I N A R I E S Bhavna Soman Cyber Analyst/Developer, Intel Corp. @bsoman3, #codeblue_jp Copyright © Intel Corporation 2015. All rights reserved.
  2. 2. - G E O R G E P. B U R D E L L Opinions expressed are those of the author and do not reflect the opinions of his/her employer. - L E G A L Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system configuration. No computer system can be absolutely secure. Check with your system manufacturer or retailer or learn more at intel.com. D I S C L A I M E R S
  3. 3. W H AT A D VA N TA G E C A N K N O W I N G T H E O R I G I N S O F A M A L I C I O U S B I N A RY G I V E Y O U ? ? • We can apply past analyses of motivations and capabilities of adversary • Connect disparate events into one whole picture • So what’s the best way to connect the dots?
  4. 4. A G E N D A • Methods to connect binaries • Getting a test dataset and ground truth • Results • Sample clusters found • Takeaways and Future direction
  5. 5. W H AT I S T H E B E S T WAY T O C O N N E C T S I M I L A R B I N A R I E S ? ? • Imphash— md5 hash of the import table • ssdeep— Context triggered piecewise hashing • SDhash— Bloom filters How to : 1. Get non-trivial dataset of binaries related to targeted campaigns 2. Establish ground truth without static/dynamic analyses of hundreds of binaries?
  6. 6. G AT H E R I N G D ATA • Published Jan-March 2015 • e.g. “Project Cobra Analysis”, “The Desert Falcon Targeted Attacks” • Extract MD5s • >10% Malicious on Virus Total MD5s Similarity Metrics • Calculate for each binary • Import hash • ssdeep • SDhash EXTRACT CALCULATE APT Whitepapers
  7. 7. A S S E S S I N G C O R R E L AT I O N S Are these malware related?
  8. 8. {Actor Names, Campaign Name, Malware Families, Aliases} APT1 APT2 {Actor Names, Campaign Name, Malware Families, Aliases} A S S E S S I N G C O R R E L AT I O N S
  9. 9. • No one method found all the correlations • Imphash had the most false positives • Sdhash had maximum recall • Both ssdeep and SDhash had near perfect precision S U M M A RY R E S U LT S Recall Precision
  10. 10. I M P H A S H E S
  11. 11. • 4 0 8 T R U E C O R R E L AT I O N S • 1 7 2 FA L S E P O S I T I V E S • H I G H F I D E L I T Y T R U E P O S I T I V E S • 2 C O R R E L AT I O N S A C R O S S C A M PA I G N S B Y T H E S A M E A C T O R • N O C O R R E L AT I O N S B E T W E E N D I F F E R E N T V E R S I O N S O F T H E S A M E M A LWA R E • N O C O R R E L AT I O N S A C R O S S PA RT S O F T H E K I L L C H A I N W I T H I N C A M PA I G N I M P H A S H S C O R E FREQUENCY
  12. 12. I M P H A S H
  13. 13. I M P H A S H • S AV s a m p l e s c i r c a 2 0 1 1 • U s e d b y t h e Wa t e r b u g A tt a c k g r o u p • A KA Tu r l a / U r u b o r o s • Ve r s i o n 1 . 5 o f Co m R AT ( Tu r l a A tt a c ke r s ) • Co m p i l e d o n M a r c h 2 5 , 2 0 0 8 • O t h e r v e r s i o n s o f t h e R AT i n t h e d a t a s e t w e r e n o t c o n n e c t e d • W i p b o t 2 0 1 3 S a m p l e s • U s e d b y t h e Wa t e r b u g a tt a c k G r o u p • Co m p i l e d o n 1 5 - 1 0 - 2 0 1 3 • A l s o r e f e r r e d t o a s Ta v d i g / Wo r l d C u p S e c / Ta d j M a k h a l
  14. 14. I M P H A S H
  15. 15. I M P H A S H • B o t h s a m p l e s o f Co m R AT • A s s o c i a t e d w i t h Wa t e r b u g G r o u p a n d Tu r l a A tt a c ke r s r e s p e c t i v e l y • S a m p l e s o f t h e Ca r b o n M a l w a r e • R e l a t e d t o Pr o j e c t Co b r a a n d T h e Wa t e r b u g A tt a c k G r o u p
  16. 16. I M P H A S H
  17. 17. I M P H A S H • C r e d e n t i a l s t e a l e r a n d d r o p p e r f r o m O P A r i d V i p e r • V s . D r o p p e r s u s e d b y A tt a c k s o n t h e Sy r i a n O p p o s i t i o n Fo r c e s • N o c o m m o n a tt r i b u t i o n o r K N O W N l i n k • B i n a r i e s f r o m S I X d i ff e r e n t c a m p a i g n s • N o c o m m o n A c t o r o r M a l w a r e Fa m i l y • D i ff e r e n t p a r t s o f t h e K i l l c h a i n
  18. 18. I M P H A S H
  19. 19. S S D E E P
  20. 20. • 8 5 6 T R U E C O R R E L AT I O N S • 0 FA L S E P O S I T I V E S • 1 C O R R E L AT I O N S F O U N D C O N N E C T I N G C A M PA I G N S B Y T H E S A M E A C T O R • S E V E R A L C O R R E L AT I O N S B E T W E E N M I N O R V E R S I O N S O F S A M E M A LWA R E • N O C O R R E L AT I O N S A C R O S S PA RT S O F T H E K I L L C H A I N W I T H I N C A M PA I G N S S D E E P S C O R E FREQUENCY
  21. 21. S S D E E P
  22. 22. S S D E E P • W i p b o t 2 0 1 3 • U s e d b y t h e Wa t e r b u g a tt a c k g r o u p • Co r r e l a t i o n a c r o s s m i n o r v e r s i o n s o f Co m R AT • Co m p i l e d a t e s s p a n o v e r 3 y e a r s • S AV / U r u b o r o s s a m p l e s • U s e d b y t h e Wa t e r b u g A tt a c k g r o u p • T i m e s t a m p e d 2 0 1 3
  23. 23. S S D E E P
  24. 24. S S D E E P • B a c k d o o r s u s e d i n O P D e s e r t Fa l c o n ( Ka s p e r s k y ) • 6 3 0 Co r r e l a t i o n s . A v e r a g e s i m i l a r i t y s c o r e w a s 3 5 . 1 3 • D i ff e r e n t Ve r s i o n s o f Ca r b o n M a l w a r e c o m p l i e d i n 2 0 0 9 • Fr o m Pr o j e c t Co b r a a n d Wa t e r b u g Ca m p a i g n s .
  25. 25. S S D E E P
  26. 26. S S D E E P N O FA L S E P O S I T I V E S
  27. 27. S D H A S H
  28. 28. • T H R E S H O L D = 1 0 • 1 4 1 2 T R U E C O R R E L AT I O N S • 3 FA L S E P O S I T I V E S • 1 C O R R E L AT I O N S F O U N D C O N N E C T I N G C A M PA I G N S B Y T H E S A M E A C T O R • S E V E R A L C O R R E L AT I O N S B E T W E E N M I N O R V E R S I O N S O F S A M E M A LWA R E • 1 C O R R E L AT I O N S A C R O S S PA RT S O F T H E K I L L C H A I N W I T H I N C A M PA I G N S D H A S H S C O R E FREQUENCY
  29. 29. S D H A S H
  30. 30. S D H A S H • Co r r e l a t i o n b e t w e e n D r o p p e r, S t a g e 1 , S t a g e 2 a n d I n j e c t e d L i b r a r y o f Co b r a Ca m p a i g n • H i g h s i m i l a r i t y w i t h Ca r b o n To o l u s e d b y t h e Wa t e r b u g g r o u p • W i d e l y v a r y i n g AV l a b e l s e v e n c o n t r o l l i n g f o r v e n d o r • Co r r e l a t i o n s m a d e b y s d h a s h o n l y • S AV / U r u b o r o s s a m p l e s • 3 0 d i ff e r e n t B i n a r i e s c o m p i l e d o v e r 3 m o n t h s i n 2 0 1 3
  31. 31. S D H A S H
  32. 32. S D H A S H • B a c k d o o r u s e d b y O P D e s e r t Fa l c o n • V s . S c a n b o x s a m p l e ( k n o w n t o b e r e l a t e d t o A n t h e m a tt a c k s a n d D e e p Pa n d a ) • N o k n o w n r e l a t i o n s h i p b e t w e e n t h o s e a c t o r s / c a m p a i g n s / m a l w a r e f a m i l i e s • “ H tt p B r o w s e r ” m a l w a r e u s e d i n A n t h e m a tt a c k • “A m m y A d m i n ” t o o l u s e d b y t h e Ca r b a n a k g r o u p
  33. 33. S D H A S H
  34. 34. W H E R E W E S TA N D • Imphash, ssdeep or SDhash?? • Path finding-ish. Engineer systems to make connections • APT binaries may reuse code —use it against them • It pays to know your adversary.
  35. 35. A C K S / Q & A / T H A N K S ! @bsoman3, bhavna.soman@ {intel.com, gmail.com} • Chris Kitto and Jeff Boerio for helping me make better slides. • Wonderful folks that write Security white papers • @kbandla for creating and maintaining APTNotes • Virus Total for the great data they provide

×