Your SlideShare is downloading. ×
How to really obfuscate   your pdf malware
Upcoming SlideShare
Loading in...5
×

Thanks for flagging this SlideShare!

Oops! An error has occurred.

×
Saving this for later? Get the SlideShare app to save on your phone or tablet. Read anywhere, anytime – even offline.
Text the download link to your phone
Standard text messaging rates apply

How to really obfuscate your pdf malware

5,506

Published on

Published in: Technology
1 Comment
6 Likes
Statistics
Notes
No Downloads
Views
Total Views
5,506
On Slideshare
0
From Embeds
0
Number of Embeds
25
Actions
Shares
0
Downloads
214
Comments
1
Likes
6
Embeds 0
No embeds

Report content
Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
No notes for slide

Transcript

  • 1. How to really obfuscate your PDF malware Sebastian Porst - ReCon 2010 Email: sebastian.porst@zynamics.com Twitter: @LambdaCube 1
  • 2. Targeted Attacks 2008 Adobe Acrobat Microsoft Word; Reader; 28.61% 34.55% Microsoft PowerPoint; Microsoft Excel; 16.87% 19.97% http://www.f-secure.com/weblog/archives/00001676.html 2
  • 3. Targeted Attacks 2009 Adobe Acrobat Microsoft Word; Reader; 48.87% 39.22% Microsoft Microsoft PowerPoint; Excel; 7.39% 4.52% 3
  • 4. Exploited in the wild CVE- CVE- CVE- CVE- 2007- 2009- 2009- 2009- 5659 0658 1492 4324 CVE- CVE- CVE- CVE- 2008- 2009- 2009- 2010- 2992 0927 3459 0188
  • 5. Four common exploit paths Broken PDF Parser Vulnerable JavaScript Engine Vulnerable external libraries /Launch 5
  • 6. PDF Malware Obfuscation Different tricks for different purposes Make manual analysis more difficult Resist automated analysis Avoid detection by virus scanners 6
  • 7. PDF Malware Obfuscation Conflicting goals Avoid detection Make analysis by being difficult by being wellformed malformed 7
  • 8. How to achieve these goals Being harmless Being evil • Avoid JavaScript • Use heavy • Do not use unusual obfuscation encodings • Try to break tools • Do not try to break parser-based tools • Ideally use an 0-day 8
  • 9. Let‘s be evil 9
  • 10. Breaking tools
  • 11. Rule #1: Do the unexpected 11
  • 12. This is what tools expect • ASCII Strings • Boring encodings like #41 instead of A • Wellformed or only moderately malformed PDF file structure 12
  • 13. Malformed documents • Adobe Reader tries to load malformed PDF files • Very, very liberal interpretation of the PDF specification • Parser-based analysis tools need to know about Adobe Reader file correction 13
  • 14. Malformed PDF file – Example I 7 0 obj << /Type /Action /S /JavaScript /JS (app.alert('whatever');) >> endobj 14
  • 15. Malformed PDF file – Example II 5 0 obj << /Length 45 >> stream some data endstream endobj 15
  • 16. Further reading 16
  • 17. Obfuscating JavaScript code
  • 18. Goal of JavaScript obfuscation Hide the shellcode 18
  • 19. JavaScript obfuscation in the wild • Screwed up formatting • Name obfuscation • Eval-chains • Splitting JavaScript code • Simple anti-emulation techniques • callee-trick • ... 19
  • 20. Screwed up formatting • Basically just remove all newlines • Completely useless: jsbeautifier.org 20
  • 21. Name obfuscation • Variables or function names are renamed to hide their meaning • Most JavaScript obfuscators screw this up 21
  • 22. Obfuscation example: Original code function executePayload(payload, delay) { if (delay > 1000) { // Whatever } } function heapSpray(code, repeat) { for (i=0;i<repeat;i++) { code = code + code; } } 22
  • 23. Obfuscation without considering scope function executePayload(hkof3ewhoife, fhpfewhpofe) { if (fhpfewhpofe > 1000) { // Whatever } } function heapSpray(hoprwehjoprew, hoifwep43) { for (jnpfw93=0;jnpfw93<hoifwep43;jnpfw93++) { hoprwehjoprew = hoprwehjoprew + hoprwehjoprew; } } 23
  • 24. Obfuscation with considering scope function executePayload(grtertttrr, hnpfefwefee) { if (hnpfefwefee > 1000) { // Whatever } } function heapSpray(grtertttrr, hnpfefwefee) { for (hjnprew=0;hjnprew<hnpfefwefee;hjnprew++) { grtertttrr = grtertttrr + grtertttrr; } } 24
  • 25. Obfuscation: Going the whole way function ____(____, _____) { if (_____ > 1000) { // Whatever } } function _____(____, _____) { for (______=0; ______<_____; ______++) { ____ = ____ + ____; } } 25
  • 26. Name obfuscation: Lessons learned • Consider name scope – Deobfuscator needs to know scoping rules too • Use underscores – Drives human analysts crazy • Also cute: Use meaningful names that have nothing to do with the variable – Maybe shuffle real variable names 26
  • 27. Eval chains • JavaScript code can execute JavaScript code in strings through eval • Often used to hide later code stages which are decrypted on the fly • Common way to extract argument: replace eval with a printing function 27
  • 28. Eval chains: Doing it better • Make sure your later stages reference variables or functions from earlier stages • Re-use individual eval statements multiple times to make sure eval calls can not just be replaced 28
  • 29. JavaScript splitting • JavaScript can be split over several PDF objects • These scripts can be executed consecutively • Context is preserved between scripts • In the wild I‘ve seen splitting across 2-4 objects 29
  • 30. JavaScript splitting: Doing it better • One line of JavaScript per object • Randomize the order of JavaScript objects • Admittedly it takes only one script to sort and extract the scripts from the objects 30
  • 31. Anti-emulation code • Simple checks for Adobe Reader extensions • Multistaged JavaScript code 31
  • 32. Current malware loads code from Pages Annotations Info Dictionary 32
  • 33. Example: Loading code from annotations y = app.doc; y.syncAnnotScan(); var p = y["getAnnots"]({nPage: 0}); var s = p[0].subject; eval(s); 33
  • 34. Problems with current approaches Code is Easy to in the file extract 34
  • 35. Anti-emulation code: Improved Key ideas behind anti-emulation code Find idiosyncrasies in the Adobe JavaScript engine Find extensions that are difficult to emulate 35
  • 36. Exhibit A: Idiosyncrasy cypher = [7, 17, 28, 93, 4, 10, 4, 30, 7, 77, 83, 72]; cypherLength = cypher.length; hidden = "ThisIsNotTheKeyYouAreLookingFor"; hiddenLength = hidden.toString().length; for(i=0,j=0;i<cypherLength;i++,j++) { cypherChar = cypher[i]; keyChar = hidden.toString().charCodeAt(j); cypher[i] = String.fromCharCode(cypherChar ^ keyChar); if (j == hiddenLength - 1) j = -1; } eval(cypher.join("")); 36
  • 37. Exhibit A: Explained JavaScript Standard Adobe Reader JavaScript hidden = false; hidden = false; hidden = "Key"; hidden = "Key"; hidden has the value „Key“ hidden has the value „true“ 37
  • 38. Exhibit A: Explained The Adobe Reader JavaScript engine defines global variables that do not change their type on assignment. (I suspect this happens because they are backed by C++ code) 38
  • 39. Exhibit B: Difficult to emulate • Goal: Find Adobe JavaScript API functions which are nearly impossible to emulate • Then use effects of these functions in sneaky ways to change malware behavior • The Adobe Reader JavaScript documentation is your friend 39
  • 40. Exhibit B: Difficult to emulate Functions to look for Rendering engine Forms extensions Multimedia extensions 40
  • 41. Exhibit B: Difficult to emulate crypt = "T^_]^[T IEYYD__ FuRRKBD "; plain = Array(); key = getPageNthWordQuads(0, 0).toString().split(",")[1]; for (i=0,j=0;i<crypt.length;i++,j++) { plain = plain + String.fromCharCode((crypt.charCodeAt(i) ^ key.charCodeAt(j))); if (j >= key.length) j = 0; } app.alert(plain); ) 41
  • 42. Exhibit B: Difficult to emulate Functions to avoid Anything with security restrictions 42
  • 43. Exhibit C: Multi-threaded JavaScript • Multi-threaded applications are difficult to reverse engineer • Problem: There are no threads in JavaScript • Solution: setTimeOut • Example: Cooperative multi-threading with message-passing between objects 43
  • 44. Basic idea • Multiple server objects • String messages are passed between servers • Messages contain new timeout value and code to evaluate 44
  • 45. function Server(name) { ... } s1 = new Server("S1"); s2 = new Server("S2"); s1.receive(ENCODED_MESSAGE); 45
  • 46. function Server(name) { this.name = name; this.receive = function(message) { recipient = parse_recipient(message) delayTime = parse_delay(message) eval_string = parse_eval_string(message) msg_string = parse_message_string(message) eval(eval_string); command = "recipient.receive('" + msg_string + "')"; this.x = app.setTimeOut(command, delayTime); } }; 46
  • 47. How to improve this • Use a global string object as the message queue and manipulate the object on the fly • Usage of non-commutative operations so that execution order really matters • Message broadcasting • Add anti-emulation code to eval-ed code 47
  • 48. callee-trick • Not specific to Adobe Reader • Frequently used by JavaScript code in other contexts • Function accesses its own source and uses it as a key to decrypt code or data • Add a single whitespace and decryption fails 48
  • 49. callee-trick Example function decrypt(cypher) { var key = arguments.callee.toString(); for (var i = 0; i < cypher.length; i++) { plain = key.charCodeAt(i) ^ cypher.charCodeAt(i); } ... } 49
  • 50. More ideas for the future • Combine anti-debugging, callee-trick, and message passing • Find more JavaScript engine idiosyncracies: Sputnik JavaScript test suite 50
  • 51. Thanks • Didier Stevens • Julia Wolf • Peter Silberman • Bruce Dang 51
  • 52. 52

×