Jack of all Formats<br />Daniel “unicornFurnace” Crowley<br />Application Security Services, Trustwave - SpiderLabs<br />
In this presentation, I discuss four different approaches to merging multiple files of different formats into one, such that it can be read as each type. I then discuss the security implications of this property, which is inherent in many file formats, and theorize about attacks which can be launched when developers assume that files can only be one format.

  • Here we have placed the string “test\n” in front of a valid 7zip file.
  • Given that the file doesn’t start with the 7zip start marker and instead begins with plaintext and a newline, the UNIX ‘file’ utility misinterprets it as a data file. p7zip, on the other hand, begins its interpretation of the file starting at the 7zip header. This results in the file still being a valid 7zip archive.
  • Here, while saving a GIF in GIMP, we write a PHP backdoor into a comment. This will be mostly ignored when parsing the file as an image, but as PHP only interprets code between its start and end markers “&lt;?php” and “?&gt;”, the image data will not affect the execution of the script.
  • The backdoor is written directly into the file.
  • Here is the combination PDF and 7zip file we’ve created, opened as a PDF.
  • Then, we change the file extension (though this actually should be unnecessary) and list the contents of the embedded 7zip archive.
  • This is a JPEG file. It looks ordinary and parses correctly.
  • When we interpret the same file as a RAR archive, we find that we have a valid archive, too! This RAR archive was simply appended to the end of our original JPEG. While it is possible to append a RAR to the end of a JPEG and get a file which opens as either format, it is not possible to append a JPEG to the end of a RAR and achieve the same results. This is due to the use of absolute offsets in the JPEG format which must be adjusted to point to the correct resources.
  • Before 4chan put a fix in place, it was fairly commonplace to see book sharing threads there, where people appended RAR files containing ebooks to JPEGs of the matching book covers. People could download a JPEG, change the extension to .rar, and get an ebook of the book shown on the cover.
  • Jack of all Formats

    1. Jack of all Formats<br />Daniel “unicornFurnace” Crowley<br />Application Security Services, Trustwave - SpiderLabs<br />
    2. Introductions<br />How can files be multiple formats?<br />Why is this interesting from a security perspective?<br />What can we do about it?<br />(yodawg we heard you like files so we put files in your files)<br />
    3. Terms<br />File piggybacking<br />Placing one file into another<br />File consumption<br />Parsing a file and interpreting its contents<br />
    4. Scope of this talk<br />Files which can be interpreted as multiple formats<br />…with at most a change of file extension<br />Covert channels<br />Through use of piggybacking<br />Examples are mostly Web-centric<br />Only because it’s my specialty<br />This concept applies to more than Web applications<br />Srsly this applies to more than Web applications<br />GUYS IT’S NOT JUST WEB APPS<br />
    5. Files with multiple formats<br />How to piggyback files<br />(Clap and cheer now to confuse the people who can’t read this)<br />
    6. File format flexibility<br />Not always rigidly defined<br />From the PDF specification: “This standard does not specify the following: … methods for validating the conformance of PDF files or readers…”<br />Thank you Julia Wolf for “OMG WTF PDF”<br />CSV comments exist but are not part of the standard<br />Not all data in a file is parsed<br />Metadata<br />Unreferenced blocks of data<br />Data outside start/end markers<br />Reserved, unused fields<br />
    7. File format flexibility<br />Some data can be interpreted multiple ways<br />Method of file consumption often determined by:<br />File extension<br />Multiple file extensions may result in multiple parses<br />Bytes at beginning of file<br />First identified file header<br />
    8. 7zip file with junk data at the beginning<br />
    9. 7zip file with junk data at the beginning<br />
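The difference between identifying a file by the "bytes at beginning of file" and by the "first identified file header" can be sketched in a few lines of Python. This is only an illustration of the two strategies, not how the UNIX file utility or p7zip are actually implemented. The function names are made up for this example, and the "archive" is a stand-in made of the 7z signature plus placeholder bytes rather than a real archive.

```python
# 7z signature: the characters '7z' followed by BC AF 27 1C.
SEVENZIP_MAGIC = bytes.fromhex("377abcaf271c")


def identify_by_leading_bytes(data: bytes) -> str:
    """Strict strategy: the signature must sit at offset 0."""
    return "7z" if data.startswith(SEVENZIP_MAGIC) else "data"


def find_first_header(data: bytes) -> int:
    """Lenient strategy: return the offset of the first signature, or -1."""
    return data.find(SEVENZIP_MAGIC)


# Stand-in for a real archive (assumption: only the signature matters here).
fake_archive = SEVENZIP_MAGIC + b"placeholder archive body"
# Same trick as on the slide: put "test\n" in front of the archive.
piggybacked = b"test\n" + fake_archive

strict_result = identify_by_leading_bytes(piggybacked)
lenient_offset = find_first_header(piggybacked)
```

A strict consumer sees plain data, while a lenient one finds the header five bytes in and carries on from there.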
    10. Multiple file extensions<br />Apache has:<br />Languages<br />Handlers<br />MIME types<br />Example: File.en.php.png<br />“File”: basename, largely ignored<br />“.en”: language, US English<br />“.php”: triggers the PHP handler<br />“.png”: triggers the image/png MIME type<br />
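The way each extension in File.en.php.png can contribute a different piece of metadata can be modeled roughly as below. This is a simplified sketch, not Apache's code: the three lookup tables and the handler name "php-handler" are hypothetical stand-ins for whatever the server configuration defines, and real behavior depends on how handlers and types are configured.

```python
# Hypothetical lookup tables standing in for server configuration.
LANGUAGES = {"en": "en"}
HANDLERS = {"php": "php-handler"}  # illustrative handler name
MIME_TYPES = {"png": "image/png", "html": "text/html"}


def classify(filename: str) -> dict:
    """Look up every extension independently; the basename is ignored."""
    basename, *extensions = filename.split(".")
    result = {"basename": basename, "language": None,
              "handler": None, "mime_type": None}
    for ext in extensions:
        ext = ext.lower()
        if ext in LANGUAGES:
            result["language"] = LANGUAGES[ext]
        if ext in HANDLERS:
            result["handler"] = HANDLERS[ext]
        if ext in MIME_TYPES:
            result["mime_type"] = MIME_TYPES[ext]
    return result


info = classify("File.en.php.png")
```

In this model a file whose name ends in .png still picks up a handler from the .php in the middle, which is the behavior the slide warns about.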
    11. Metadata<br />Information about the file itself<br />Not always parsed by the file consumer<br />“Comment” fields, few restrictions on data<br />Files can be inserted into comment fields for one format<br />ID3 tags for mp3 files will be shown in players<br />But not usually interpreted<br />
    12. Metadata – GIF comment<br />
    13. Metadata – GIF comment<br />
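The slides do this step in GIMP. As a rough programmatic equivalent, the sketch below writes a GIF89a comment extension (introducer 0x21, label 0xFE, then length-prefixed sub-blocks ending in a zero byte) right after the global color table. The function names, the tiny 1x1 test image, and the harmless placeholder payload are all assumptions made for this example.

```python
def comment_extension(comment: bytes) -> bytes:
    """Encode a GIF89a comment extension: 0x21 0xFE, sub-blocks, 0x00."""
    out = bytearray(b"\x21\xfe")
    for start in range(0, len(comment), 255):
        block = comment[start:start + 255]
        out.append(len(block))  # each sub-block holds at most 255 bytes
        out += block
    out.append(0)  # block terminator
    return bytes(out)


def insert_gif_comment(gif: bytes, comment: bytes) -> bytes:
    """Insert a comment extension after the header and global color table."""
    if gif[:6] != b"GIF89a":
        raise ValueError("expected a GIF89a file")
    packed = gif[10]
    insert_at = 13  # 6-byte signature + 7-byte logical screen descriptor
    if packed & 0x80:  # global color table present
        insert_at += 3 * (2 ** ((packed & 0x07) + 1))
    return gif[:insert_at] + comment_extension(comment) + gif[insert_at:]


# A 1x1 GIF with a two-color global color table (assumed test image).
TINY_GIF = (
    b"GIF89a"
    b"\x01\x00\x01\x00\x80\x00\x00"              # logical screen descriptor
    b"\x00\x00\x00\xff\xff\xff"                  # global color table
    b"\x2c\x00\x00\x00\x00\x01\x00\x01\x00\x00"  # image descriptor
    b"\x02\x02\x44\x01\x00"                      # LZW image data
    b"\x3b"                                      # trailer
)
PAYLOAD = b"<?php echo 'hello'; ?>"  # harmless placeholder script

polyglot = insert_gif_comment(TINY_GIF, PAYLOAD)
```

An image viewer is expected to skip the comment block, while a PHP interpreter only cares about the bytes between the PHP start and end markers.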
    14. Unreferenced blocks of data<br />Certain formats define resources with offsets and sizes<br />Unmentioned parts of the file are ignored<br />Other files can occupy unmentioned space<br />Other formats indicate a total size of data to be parsed<br />Any additional data is ignored<br />Other files can simply be appended<br />Some formats indicate that unrecognized data is ignored<br />May still need to be formatted correctly<br />
    15. Unreferenced PDF object<br />PDF xref table lists object offsets in the file<br />We first remove one reference<br />Next, we replace part of that object’s content…<br />…with a 7zip file.<br />
    16. PDF / 7Z opened as a PDF<br />
    17. PDF / 7Z opened as a 7Z<br />
    18. PNG file format<br />Static signature<br />Series of chunks<br />IHDR chunk<br />Other chunks including at least one IDAT chunk<br />IEND chunk<br />
    19. PNG chunk format<br />4 byte length field<br />4 byte identification field<br />Data<br />4 byte CRC of id field and data field<br />Chunks with unknown IDs will be ignored<br />The CRC will likely not even be checked<br />
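The chunk layout described above (length, identification field, data, CRC over id and data) can be sketched with Python's struct and zlib modules. The helper names and the 1x1 grayscale test image are assumptions made for this example. The chunk type "jaCK" starts with a lowercase letter, which in PNG marks a chunk as ancillary, so decoders that do not recognize it are expected to skip it.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_chunk(chunk_type: bytes, data: bytes) -> bytes:
    """Build one chunk: 4-byte length, 4-byte type, data, 4-byte CRC."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return (struct.pack(">I", len(data)) + chunk_type + data
            + struct.pack(">I", crc))


def iter_chunks(png: bytes):
    """Yield (type, data) for every chunk after the signature."""
    pos = len(PNG_SIGNATURE)
    while pos < len(png):
        (length,) = struct.unpack(">I", png[pos:pos + 4])
        yield png[pos + 4:pos + 8], png[pos + 8:pos + 8 + length]
        pos += 12 + length  # length + type + data + CRC


def minimal_png() -> bytes:
    """A 1x1, 8-bit grayscale image (assumed test input)."""
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    idat = zlib.compress(b"\x00\x00")  # filter byte + one black pixel
    return (PNG_SIGNATURE + make_chunk(b"IHDR", ihdr)
            + make_chunk(b"IDAT", idat) + make_chunk(b"IEND", b""))


def add_private_chunk(png: bytes, chunk_type: bytes, payload: bytes) -> bytes:
    """Insert a chunk just before the final IEND chunk."""
    type_at = png.rfind(b"IEND")
    if type_at < 4:
        raise ValueError("no IEND chunk found")
    iend_at = type_at - 4  # step back over IEND's length field
    return png[:iend_at] + make_chunk(chunk_type, payload) + png[iend_at:]


carrier = add_private_chunk(minimal_png(), b"jaCK", b"any other file's bytes")
chunk_types = [ctype for ctype, _ in iter_chunks(carrier)]
```

The payload here is a placeholder string. Anything that fits in a 4-byte length field could be carried the same way.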
    26. jaCK chunk<br />
    27. Start/End markers<br />Many formats use a magic byte sequence to denote the beginning of data<br />Similarly, many have one to denote the end of data<br />Data outside start/end markers is ignored<br />Files can be placed before or after such markers<br />Files must not contain conflicting markers<br />
    28. Start/End markers<br />JPEG<br />Start marker: 0xFFD8<br />End marker: 0xFFD9<br />RAR<br />Start marker: 0x526172211A0700<br />PDF<br />Start marker: %PDF<br />End marker: \n%%EOF\n (\r and \r\n can replace \n)<br />PHP<br />Start marker: &lt;?php<br />End marker: ?&gt;<br />
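As a rough illustration of the start markers listed above, the sketch below scans a byte string for each one and reports where it first appears. The function name is made up, and the JPEG and RAR inputs are stand-ins built from the markers plus placeholder bytes, not real files. Short markers such as the two-byte JPEG one can also match by chance inside unrelated data, so a hit is a hint rather than proof.

```python
# Start markers taken from the slide above.
START_MARKERS = {
    "JPEG": bytes.fromhex("ffd8"),
    "RAR": bytes.fromhex("526172211a0700"),
    "PDF": b"%PDF",
    "PHP": b"<?php",
}


def find_start_markers(data: bytes) -> dict:
    """Map each format name to the offset of its first start marker."""
    offsets = {}
    for name, marker in START_MARKERS.items():
        pos = data.find(marker)
        if pos != -1:
            offsets[name] = pos
    return offsets


# Stand-ins for real files (assumption: only the markers matter here).
jpeg_stand_in = bytes.fromhex("ffd8") + b"image data" + bytes.fromhex("ffd9")
rar_stand_in = bytes.fromhex("526172211a0700") + b"archive data"

# Appending the archive after the JPEG end marker gives one file
# that carries the start markers of both formats.
combined = jpeg_stand_in + rar_stand_in
found = find_start_markers(combined)
```

A result with more than one entry, or with an entry at a non-zero offset, suggests the file may be consumable as more than one format.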
    29. A WinRAR is you!<br />
    30. A WinRAR is also JPEG!<br />
    31. Limitations<br />Some formats use absolute offsets<br />They must be placed at start of file or offsets must be adjusted<br />Examples: JPEG, BMP, PDF<br />Some have headers which indicate the size of each resource to follow<br />Such files are usually easy to work with<br />Other files can be appended without breaking things<br />Examples: RAR<br />
    32. Limitations<br />Some files are simply parsed from start to end<br />Such files require some metadata, unreferenced space, or data which can be manipulated to have multiple meanings<br />Different parsers for the same format operate differently<br />Might implement different non-standard features<br />May interpret format of files in different ways<br />
    33. TrueCrypt volumes<br />No start/end markers<br />No publicly known signature<br />Parsed from start of file to end of file<br />No metadata fields<br />No unused space<br />Data is difficult to manipulate<br />
    34. TrueCrypt volumes<br />
    35. Security Implications<br />Reasons why file piggybacking must be considered<br />(Read the first word in every sub-bullet on the next slide)<br />
    36. Security Implications<br />Data infiltration/exfiltration<br />Never check what .mp3 files pass in and out of your network?<br />Gonna change that when you get back to the office?<br />Anti-Virus evasion<br />Give an AV a piggybacked file, it might apply the wrong rules<br />You might not know that most AV applies heuristics/signatures based on identified file format!<br />File upload pwnage<br />Up loading well-formed images that are also backdoors is possible<br />
    37. Security Implications<br />Multiple file consumers<br />Different programs may interpret the file in different ways<br />GIFAR issue<br />Parasitic storage<br />How many file uploads allow only valid images?<br />Disk space exhaustion DoS<br />Some image uploads limit uploads by picture dimensions<br />Size of the file may not actually be checked<br />
    38. File upload pwnage<br />Imagine a Web-based image upload utility<br />It confirms that the uploaded file is a valid JPEG<br />It doesn’t check the file extension<br />It uploads the file into the Web root<br />It doesn’t set the permissions to disallow execution<br />Code upload is possible if the file is also a valid JPEG<br />This isn’t hard…<br />
    39. Anti-Virus evasion exercise<br />Check detection rates on Win32 netcat<br />Place it in an archive and check<br />Put junk data at the beginning of the file and check<br />Piggyback the archive onto the end of a JPEG and check<br />Change the file extension to .JPG and check<br />
    40. Check detection rates on netcat<br />
    41. Archive netcat and check again<br />
    42. Add junk at the beginning of the file<br />
    43. Piggyback the archive onto a JPEG<br />
    44. Change the extension to .jpg<br />
    45. LULZ netkitties<br />
    46. Data Infiltration<br />Take the previous example of a 7z attached to a JPEG<br />This will bypass lots of AV<br />Maybe also IDS/IPS<br />Haven’t tested it<br />
    47. Data Exfiltration<br />DLP will generally look for:<br />Type of files being communicated<br />Content of traffic<br />Communication properties<br />These techniques allow for covert channels<br />With wide bandwidth<br />With some plausible deniability<br />In files which are<br />Ordinarily harmless<br />Frequently passed<br />Without breaking the piggybacked files’ usability<br />
    48. Parasitic storage<br />Certain sites allow for file upload of specific formats<br />File piggybacking essentially removes this limitation<br />This technique has been used on 4chan (now fixed)<br />Book sharing threads<br />LOIC distribution<br />CP distribution<br />Still works on certain image sites<br />Browsers automagically download images<br />What if those images are also malware?<br />Now all you need to do is figure out how to execute it…<br />
    49. Multiple File Consumers<br />GIFAR issue<br />JAR appended to the end of a GIF<br />Browser loads the GIF<br />Old versions of JVM would recognize AND RUN the JAR<br />Apache handling “file.en.php.png”<br />Passes file to PHP for preprocessing<br />Serves resulting output with<br />a US English language setting<br />MIME type of “image/png”<br />
    50. Disk Space Exhaustion DoS<br />Imagine a file upload utility<br />It allows the upload of only 1x1 images<br />For disk space reasons<br />Append 2GB of junk onto the end of a 1x1 image<br />???<br />NO DISK SPACE!!!<br />Checking properties of the file format may not be sufficient<br />
    51. Protections<br />What can we do about this?<br />(Not much)<br />
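The 1x1 image scenario above comes down to checking a property stored inside the file (its dimensions) without checking the size of the file itself. A minimal sketch of checking both, assuming PNG uploads, is shown below. The function names, limits, and the header-only test input are illustrative, and a real upload handler would usually enforce a size limit before reading the whole body into memory.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_dimensions(data: bytes) -> tuple:
    """Read width and height from the IHDR chunk that follows the signature."""
    if not data.startswith(PNG_SIGNATURE) or data[12:16] != b"IHDR":
        raise ValueError("not a PNG")
    return struct.unpack(">II", data[16:24])


def check_upload(data: bytes, max_side: int = 1, max_bytes: int = 1024) -> list:
    """Return a list of problems; an empty list means the upload passes."""
    problems = []
    width, height = png_dimensions(data)
    if width > max_side or height > max_side:
        problems.append("dimensions")
    if len(data) > max_bytes:  # the check the slide says is often missing
        problems.append("file size")
    return problems


# Header-only stand-in for a 1x1 PNG (assumption: only the IHDR fields
# are read, so the rest of the image and the CRC are omitted or zeroed).
tiny = (PNG_SIGNATURE + struct.pack(">I", 13) + b"IHDR"
        + struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0) + b"\x00" * 4)
padded = tiny + b"\x00" * 4096  # junk appended after the image

tiny_problems = check_upload(tiny)
padded_problems = check_upload(padded)
```

Both inputs report the same 1x1 dimensions, and only the explicit length check tells them apart.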
    75. File upload with code<br />Don’t upload in the Web root<br />Don’t allow the user to control any part of the filename<br />Don’t set the perms to executable<br />Don’t trust file properties<br />Allow only one extension<br />Allow only known good extensions<br />
    76. Anti-virus Evasion<br />We could:<br />Check for all valid file headers<br />Performance hit<br />Apply all signatures/heuristics globally<br />Big freakin’ performance hit<br />Identify by behavior<br />This doesn’t work on gateway AV<br />
    77. Disk Space Exhaustion<br />Don’t just check properties from the expected format<br />Nuff said<br />Put some additional protection in place<br />Disk quota<br />Separate partition for uploads<br />
    78. Parasitic storage<br />In metadata<br />Remove metadata<br />At end of file<br />Parse out relevant format data and save as new file<br />In unreferenced block or as part of real data<br />Don’t upload files?<br />Don’t allow unauthenticated file upload?<br />
    79. Questions?<br />Daniel Crowley<br />Dcrowley@Trustwave.com<br />@dan_crowley<br />
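Several of the "File upload with code" recommendations above (no user control over the filename, only one extension, only known good extensions, no executable permissions, storage outside the Web root) can be sketched together as follows. This is a minimal illustration under stated assumptions, not a complete upload handler: the allow-list, the function names, and the temporary directory standing in for a location outside the Web root are all made up for this example.

```python
import os
import secrets
import tempfile
from pathlib import Path

ALLOWED_EXTENSIONS = {"png", "jpg", "gif"}  # assumed allow-list


def safe_upload_name(client_filename: str) -> str:
    """Keep only an allow-listed extension; the rest is server-generated."""
    parts = client_filename.split(".")
    if len(parts) != 2 or not parts[0]:
        raise ValueError("exactly one extension is required")
    extension = parts[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        raise ValueError("extension not allowed")
    return secrets.token_hex(16) + "." + extension


def store_upload(data: bytes, client_filename: str, upload_dir: Path) -> Path:
    """Write the upload under a random name without execute permissions."""
    target = upload_dir / safe_upload_name(client_filename)
    # O_EXCL refuses to overwrite an existing file; 0o640 has no execute bit.
    fd = os.open(target, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o640)
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
    return target


# The temporary directory stands in for a directory outside the Web root.
with tempfile.TemporaryDirectory() as tmp:
    stored = store_upload(b"image bytes", "holiday.png", Path(tmp))
    stored_name = stored.name
    stored_ok = stored.read_bytes() == b"image bytes"
```

A name such as File.en.php.png is rejected outright because it has more than one extension, and nothing from the client-supplied basename survives in the stored name.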
