0
How to really obfuscate
  your PDF malware


   Sebastian Porst - ReCon 2010
Email: sebastian.porst@zynamics.com
        T...
Targeted Attacks 2008



                          Adobe Acrobat
  Microsoft Word;         Reader; 28.61%
      34.55%


 ...
Targeted Attacks 2009



                    Adobe Acrobat
  Microsoft Word;
                    Reader; 48.87%
      39.2...
Exploited in the wild

CVE-            CVE-            CVE-            CVE-
2007-           2009-           2009-         ...
Four common exploit paths

Broken PDF Parser

Vulnerable JavaScript Engine

Vulnerable external libraries

/Launch
       ...
PDF Malware Obfuscation
 Different tricks for different purposes

Make manual analysis more difficult

Resist automated an...
PDF Malware Obfuscation
         Conflicting goals


Avoid detection      Make analysis
   by being         difficult by b...
How to achieve these goals

  Being harmless         Being evil
• Avoid JavaScript       • Use heavy
• Do not use unusual ...
Let‘s be evil

                9
Breaking tools
Rule #1: Do the unexpected



                             11
This is what tools expect
• ASCII Strings
• Boring encodings like #41 instead of A
• Wellformed or only moderately malform...
Malformed documents
• Adobe Reader tries to load malformed PDF
  files
• Very, very liberal interpretation of the PDF
  sp...
Malformed PDF file – Example I
7 0 obj
  <<
   /Type /Action
   /S    /JavaScript
   /JS   (app.alert('whatever');)
  >>
e...
Malformed PDF file – Example II
5 0 obj
  << /Length 45 >>
  stream
    some data
  endstream
endobj




                 ...
Further reading




                  16
Obfuscating JavaScript code
Goal of JavaScript obfuscation




 Hide the shellcode

                                 18
JavaScript obfuscation in the wild
•   Screwed up formatting
•   Name obfuscation
•   Eval-chains
•   Splitting JavaScript...
Screwed up formatting
• Basically just remove all newlines
• Completely useless: jsbeautifier.org




                    ...
Name obfuscation
• Variables or function names are renamed to
  hide their meaning
• Most JavaScript obfuscators screw thi...
Obfuscation example: Original code
function executePayload(payload, delay)
{
  if (delay > 1000)
  {
    // Whatever
  }
}...
Obfuscation without considering scope
function executePayload(hkof3ewhoife, fhpfewhpofe)
{
  if (fhpfewhpofe > 1000)
  {
 ...
Obfuscation with considering scope
function executePayload(grtertttrr, hnpfefwefee)
{
  if (hnpfefwefee > 1000)
  {
    //...
Obfuscation: Going the whole way
function ____(____, _____)
{
  if (_____ > 1000)
  {
    // Whatever
  }
}

function ____...
Name obfuscation: Lessons learned
• Consider name scope
  – Deobfuscator needs to know scoping rules too
• Use underscores...
Eval chains
• JavaScript code can execute JavaScript code in
  strings through eval
• Often used to hide later code stages...
Eval chains: Doing it better
• Make sure your later stages reference
  variables or functions from earlier stages
• Re-use...
JavaScript splitting
• JavaScript can be split over several PDF
  objects
• These scripts can be executed consecutively
• ...
JavaScript splitting: Doing it better
• One line of JavaScript per object
• Randomize the order of JavaScript objects
• Ad...
Anti-emulation code
• Simple checks for Adobe Reader extensions
• Multistaged JavaScript code




                        ...
Current malware loads code from

Pages

Annotations

Info Dictionary
                                  32
Example: Loading code from
            annotations
y = app.doc;

y.syncAnnotScan();

var p = y["getAnnots"]({nPage: 0});

...
Problems with current approaches



    Code is         Easy to
   in the file      extract


                            ...
Anti-emulation code: Improved
 Key ideas behind anti-emulation code

Find idiosyncrasies in the
Adobe JavaScript engine

F...
Exhibit A: Idiosyncrasy
cypher = [7, 17, 28, 93, 4, 10, 4, 30, 7, 77, 83, 72];
cypherLength = cypher.length;

hidden = "Th...
Exhibit A: Explained

  JavaScript Standard         Adobe Reader JavaScript

  hidden = false;              hidden = false...
Exhibit A: Explained
The Adobe Reader JavaScript engine
defines global variables that do not
change their type on assignme...
Exhibit B: Difficult to emulate
• Goal: Find Adobe JavaScript API functions
  which are nearly impossible to emulate
• The...
Exhibit B: Difficult to emulate
       Functions to look for

Rendering engine

Forms extensions

Multimedia extensions
  ...
Exhibit B: Difficult to emulate
crypt = "T^_]^[T IEYYD__ FuRRKBD ";
plain = Array();
key = getPageNthWordQuads(0, 0).toStr...
Exhibit B: Difficult to emulate
       Functions to avoid


Anything with
security restrictions

                         ...
Exhibit C: Multi-threaded JavaScript
• Multi-threaded applications are difficult to
  reverse engineer
• Problem: There ar...
Basic idea
• Multiple server objects
• String messages are passed between servers
• Messages contain new timeout value and...
function Server(name)
{
  ...
}

s1 = new Server("S1");
s2 = new Server("S2");

s1.receive(ENCODED_MESSAGE);




         ...
function Server(name)
{
  this.name = name;

  this.receive = function(message)
  {
    recipient = parse_recipient(messag...
How to improve this
• Use a global string object as the message
  queue and manipulate the object on the fly
• Usage of no...
callee-trick
• Not specific to Adobe Reader
• Frequently used by JavaScript code in other
  contexts
• Function accesses i...
callee-trick Example
function decrypt(cypher)
{
  var key = arguments.callee.toString();

    for (var i = 0; i < cypher.l...
More ideas for the future
• Combine anti-debugging, callee-trick, and
  message passing
• Find more JavaScript engine idio...
Thanks
•   Didier Stevens
•   Julia Wolf
•   Peter Silberman
•   Bruce Dang




                               51
52
Upcoming SlideShare
Loading in...5
×

How to really obfuscate your pdf malware

2,444

Published on

Published in: Technology
0 Comments
1 Like
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total Views
2,444
On Slideshare
0
From Embeds
0
Number of Embeds
0
Actions
Shares
0
Downloads
61
Comments
0
Likes
1
Embeds 0
No embeds

No notes for slide

Transcript of "How to really obfuscate your pdf malware"

  1. 1. How to really obfuscate your PDF malware Sebastian Porst - ReCon 2010 Email: sebastian.porst@zynamics.com Twitter: @LambdaCube 1
  2. 2. Targeted Attacks 2008 Adobe Acrobat Microsoft Word; Reader; 28.61% 34.55% Microsoft PowerPoint; Microsoft Excel; 16.87% 19.97% http://www.f-secure.com/weblog/archives/00001676.html 2
  3. 3. Targeted Attacks 2009 Adobe Acrobat Microsoft Word; Reader; 48.87% 39.22% Microsoft Microsoft PowerPoint; Excel; 7.39% 4.52% 3
  4. 4. Exploited in the wild CVE- CVE- CVE- CVE- 2007- 2009- 2009- 2009- 5659 0658 1492 4324 CVE- CVE- CVE- CVE- 2008- 2009- 2009- 2010- 2992 0927 3459 0188
  5. 5. Four common exploit paths Broken PDF Parser Vulnerable JavaScript Engine Vulnerable external libraries /Launch 5
  6. 6. PDF Malware Obfuscation Different tricks for different purposes Make manual analysis more difficult Resist automated analysis Avoid detection by virus scanners 6
  7. 7. PDF Malware Obfuscation Conflicting goals Avoid detection Make analysis by being difficult by being wellformed malformed 7
  8. 8. How to achieve these goals Being harmless Being evil • Avoid JavaScript • Use heavy • Do not use unusual obfuscation encodings • Try to break tools • Do not try to break parser-based tools • Ideally use an 0-day 8
  9. 9. Let‘s be evil 9
  10. 10. Breaking tools
  11. 11. Rule #1: Do the unexpected 11
  12. 12. This is what tools expect • ASCII Strings • Boring encodings like #41 instead of A • Wellformed or only moderately malformed PDF file structure 12
  13. 13. Malformed documents • Adobe Reader tries to load malformed PDF files • Very, very liberal interpretation of the PDF specification • Parser-based analysis tools need to know about Adobe Reader file correction 13
  14. 14. Malformed PDF file – Example I 7 0 obj << /Type /Action /S /JavaScript /JS (app.alert('whatever');) >> endobj 14
  15. 15. Malformed PDF file – Example II 5 0 obj << /Length 45 >> stream some data endstream endobj 15
  16. 16. Further reading 16
  17. 17. Obfuscating JavaScript code
  18. 18. Goal of JavaScript obfuscation Hide the shellcode 18
  19. 19. JavaScript obfuscation in the wild • Screwed up formatting • Name obfuscation • Eval-chains • Splitting JavaScript code • Simple anti-emulation techniques • callee-trick • ... 19
  20. 20. Screwed up formatting • Basically just remove all newlines • Completely useless: jsbeautifier.org 20
  21. 21. Name obfuscation • Variables or function names are renamed to hide their meaning • Most JavaScript obfuscators screw this up 21
  22. 22. Obfuscation example: Original code function executePayload(payload, delay) { if (delay > 1000) { // Whatever } } function heapSpray(code, repeat) { for (i=0;i<repeat;i++) { code = code + code; } } 22
  23. 23. Obfuscation without considering scope function executePayload(hkof3ewhoife, fhpfewhpofe) { if (fhpfewhpofe > 1000) { // Whatever } } function heapSpray(hoprwehjoprew, hoifwep43) { for (jnpfw93=0;jnpfw93<hoifwep43;jnpfw93++) { hoprwehjoprew = hoprwehjoprew + hoprwehjoprew; } } 23
  24. 24. Obfuscation with considering scope function executePayload(grtertttrr, hnpfefwefee) { if (hnpfefwefee > 1000) { // Whatever } } function heapSpray(grtertttrr, hnpfefwefee) { for (hjnprew=0;hjnprew<hnpfefwefee;hjnprew++) { grtertttrr = grtertttrr + grtertttrr; } } 24
  25. 25. Obfuscation: Going the whole way function ____(____, _____) { if (_____ > 1000) { // Whatever } } function _____(____, _____) { for (______=0; ______<_____; ______++) { ____ = ____ + ____; } } 25
  26. 26. Name obfuscation: Lessons learned • Consider name scope – Deobfuscator needs to know scoping rules too • Use underscores – Drives human analysts crazy • Also cute: Use meaningful names that have nothing to do with the variable – Maybe shuffle real variable names 26
  27. 27. Eval chains • JavaScript code can execute JavaScript code in strings through eval • Often used to hide later code stages which are decrypted on the fly • Common way to extract argument: replace eval with a printing function 27
  28. 28. Eval chains: Doing it better • Make sure your later stages reference variables or functions from earlier stages • Re-use individual eval statements multiple times to make sure eval calls can not just be replaced 28
  29. 29. JavaScript splitting • JavaScript can be split over several PDF objects • These scripts can be executed consecutively • Context is preserved between scripts • In the wild I‘ve seen splitting across 2-4 objects 29
  30. 30. JavaScript splitting: Doing it better • One line of JavaScript per object • Randomize the order of JavaScript objects • Admittedly it takes only one script to sort and extract the scripts from the objects 30
  31. 31. Anti-emulation code • Simple checks for Adobe Reader extensions • Multistaged JavaScript code 31
  32. 32. Current malware loads code from Pages Annotations Info Dictionary 32
  33. 33. Example: Loading code from annotations y = app.doc; y.syncAnnotScan(); var p = y["getAnnots"]({nPage: 0}); var s = p[0].subject; eval(s); 33
  34. 34. Problems with current approaches Code is Easy to in the file extract 34
  35. 35. Anti-emulation code: Improved Key ideas behind anti-emulation code Find idiosyncrasies in the Adobe JavaScript engine Find extensions that are difficult to emulate 35
  36. 36. Exhibit A: Idiosyncrasy cypher = [7, 17, 28, 93, 4, 10, 4, 30, 7, 77, 83, 72]; cypherLength = cypher.length; hidden = "ThisIsNotTheKeyYouAreLookingFor"; hiddenLength = hidden.toString().length; for(i=0,j=0;i<cypherLength;i++,j++) { cypherChar = cypher[i]; keyChar = hidden.toString().charCodeAt(j); cypher[i] = String.fromCharCode(cypherChar ^ keyChar); if (j == hiddenLength - 1) j = -1; } eval(cypher.join("")); 36
  37. 37. Exhibit A: Explained JavaScript Standard Adobe Reader JavaScript hidden = false; hidden = false; hidden = "Key"; hidden = "Key"; hidden has the value „Key“ hidden has the value „true“ 37
  38. 38. Exhibit A: Explained The Adobe Reader JavaScript engine defines global variables that do not change their type on assignment. (I suspect this happens because they are backed by C++ code) 38
  39. 39. Exhibit B: Difficult to emulate • Goal: Find Adobe JavaScript API functions which are nearly impossible to emulate • Then use effects of these functions in sneaky ways to change malware behavior • The Adobe Reader JavaScript documentation is your friend 39
  40. 40. Exhibit B: Difficult to emulate Functions to look for Rendering engine Forms extensions Multimedia extensions 40
  41. 41. Exhibit B: Difficult to emulate crypt = "T^_]^[T IEYYD__ FuRRKBD "; plain = Array(); key = getPageNthWordQuads(0, 0).toString().split(",")[1]; for (i=0,j=0;i<crypt.length;i++,j++) { plain = plain + String.fromCharCode((crypt.charCodeAt(i) ^ key.charCodeAt(j))); if (j >= key.length) j = 0; } app.alert(plain); ) 41
  42. 42. Exhibit B: Difficult to emulate Functions to avoid Anything with security restrictions 42
  43. 43. Exhibit C: Multi-threaded JavaScript • Multi-threaded applications are difficult to reverse engineer • Problem: There are no threads in JavaScript • Solution: setTimeOut • Example: Cooperative multi-threading with message-passing between objects 43
  44. 44. Basic idea • Multiple server objects • String messages are passed between servers • Messages contain new timeout value and code to evaluate 44
  45. 45. function Server(name) { ... } s1 = new Server("S1"); s2 = new Server("S2"); s1.receive(ENCODED_MESSAGE); 45
  46. 46. function Server(name) { this.name = name; this.receive = function(message) { recipient = parse_recipient(message) delayTime = parse_delay(message) eval_string = parse_eval_string(message) msg_string = parse_message_string(message) eval(eval_string); command = "recipient.receive('" + msg_string + "')"; this.x = app.setTimeOut(command, delayTime); } }; 46
  47. 47. How to improve this • Use a global string object as the message queue and manipulate the object on the fly • Usage of non-commutative operations so that execution order really matters • Message broadcasting • Add anti-emulation code to eval-ed code 47
  48. 48. callee-trick • Not specific to Adobe Reader • Frequently used by JavaScript code in other contexts • Function accesses its own source and uses it as a key to decrypt code or data • Add a single whitespace and decryption fails 48
  49. 49. callee-trick Example function decrypt(cypher) { var key = arguments.callee.toString(); for (var i = 0; i < cypher.length; i++) { plain = key.charCodeAt(i) ^ cypher.charCodeAt(i); } ... } 49
  50. 50. More ideas for the future • Combine anti-debugging, callee-trick, and message passing • Find more JavaScript engine idiosyncracies: Sputnik JavaScript test suite 50
  51. 51. Thanks • Didier Stevens • Julia Wolf • Peter Silberman • Bruce Dang 51
  52. 52. 52
  1. A particular slide catching your eye?

    Clipping is a handy way to collect important slides you want to go back to later.

×