Here we have placed the string “test\n” in front of a valid 7zip file.
Given that the file doesn’t start with the 7zip start marker and instead begins with plaintext and a newline, the UNIX ‘file’ utility misinterprets it as a data file. p7zip, on the other hand, begins its interpretation of the file starting at the 7zip header. This results in the file still being a valid 7zip archive.
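The difference between the two tools can be sketched in a few lines of Python. This is only an illustration, not how ‘file’ or p7zip are implemented: the helper name `find_7z_offset` is hypothetical, and the archive body is placeholder bytes rather than a real archive. The six-byte value is the 7z start marker.

```python
# 7z archives begin with this six-byte signature ('7z' followed by BC AF 27 1C).
SEVENZIP_MAGIC = b"7z\xbc\xaf\x27\x1c"


def find_7z_offset(data: bytes) -> int:
    """Return the offset of the first 7z signature, or -1 if none is present."""
    return data.find(SEVENZIP_MAGIC)


# Placeholder standing in for a real archive: the signature plus dummy bytes.
fake_archive = SEVENZIP_MAGIC + b"\x00\x04" + b"\x00" * 16

# Prepend plaintext and a newline, as in the example above.
polyglot = b"test\n" + fake_archive

print(polyglot.startswith(SEVENZIP_MAGIC))  # False: a check at offset 0 misses it
print(find_7z_offset(polyglot))             # 5: the signature sits after "test\n"
```

A tool that only inspects the first bytes sees plaintext, while a tool that searches for the signature still finds the archive.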
Here, while saving a GIF in GIMP, we write a PHP backdoor into a comment. This will be mostly ignored when parsing the file as an image, but as PHP only interprets code between its start and end markers “<?php” and “?>”, the image data will not affect the execution of the script.
The backdoor is written directly into the file.
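To illustrate why the surrounding image data does not matter, here is a rough Python sketch of the "only the text between the markers counts" rule. It is a simplification under stated assumptions: the function name `php_blocks` is hypothetical, the bytes are a stand-in rather than a valid GIF, and a harmless echo is used in place of a backdoor. Real PHP has extra rules (for example, an unclosed start marker runs to the end of the file) that this sketch ignores.

```python
import re

# Non-greedy match of everything between PHP's start and end markers.
PHP_BLOCK = re.compile(rb"<\?php(.*?)\?>", re.DOTALL)


def php_blocks(data: bytes) -> list:
    """Return the byte strings that sit between PHP start and end markers."""
    return [match.strip() for match in PHP_BLOCK.findall(data)]


# Stand-in for a GIF: header bytes, a comment holding PHP, then a trailer byte.
gif_like = b"GIF89a" + b"\x01\x00\x01\x00" + b"<?php echo 'hi'; ?>" + b"\x00\x3b"

print(php_blocks(gif_like))  # [b"echo 'hi';"]
```

Everything outside the markers, including the header and trailer bytes, is simply not treated as code.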
Here is the combination PDF and 7zip file we’ve created, opened as a PDF.
Then, we change the file extension (though this actually should be unnecessary) and list the contents of the embedded 7zip archive.
This is a JPEG file. It looks ordinary and parses correctly.
When we interpret the same file as a RAR archive, we find that we have a valid archive, too! This RAR archive was simply appended to the end of our original JPEG. While it is possible to append a RAR to the end of a JPEG and get a file which opens as either format, it is not possible to append a JPEG to the end of a RAR and achieve the same results. This is due to the use of absolute offsets in the JPEG format which must be adjusted to point to the correct resources.
Before the fix was put in place, it was fairly common to see book-sharing threads on 4chan in which people appended RAR files containing ebooks to JPEG images of the matching book covers. Readers could download a JPEG, change the extension to .rar, and get an ebook of the book shown.
Jack of all Formats Daniel “unicornFurnace” Crowley Penetration Tester, Trustwave - SpiderLabs
Introductions How can files be multiple formats? Why is this interesting from a security perspective? What can we do about it? (yodawg we heard you like files so we put files in your files)
Terms File piggybacking Placing one file into another File consumption Parsing a file and interpreting its contents
Scope of this talk Files which can be interpreted as multiple formats …with at most a change of file extension Covert channels Through use of piggybacking Examples are mostly Web-centric Only because it’s my specialty This concept applies to more than Web applications Srsly this applies to more than Web applications GUYS IT’S NOT JUST WEB APPS
Files with multiple formats How to piggyback files
File format flexibility Not always rigidly defined From the PDF specification: “This standard does not specify the following: … methods for validating the conformance of PDF files or readers …” Thank you Julia Wolf for “OMG WTF PDF” CSV comments exist but are not part of the standard Not all data in a file is parsed Metadata Unreferenced blocks of data Data outside start/end markers Reserved, unused fields
File format flexibility Some data can be interpreted multiple ways Method of file consumption often determined by: File extension Multiple file extensions may result in multiple parses Bytes at beginning of file First identified file header
Multiple file extensions Apache has: Languages Handlers MIME types File.en.php.png Basename (“File”) – largely ignored File.en.php.png Language (“.en”) – US English File.en.php.png Handler (“.php”) – triggers PHP handler File.en.php.png MIME type (“.png”) – triggers image/png MIME type
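A rough Python sketch of how each extension in a name can contribute separately. This is not Apache's code: the function name and the three lookup tables are hypothetical stand-ins for what a server configuration would define (Apache uses directives such as AddLanguage, AddHandler and AddType for this).

```python
# Hypothetical lookup tables; a real server defines these in its configuration.
LANGUAGES = {"en": "English"}
HANDLERS = {"php": "PHP handler"}
MIME_TYPES = {"png": "image/png"}


def classify_extensions(filename: str) -> dict:
    """Map every extension of a filename to the metadata it would trigger."""
    basename, *extensions = filename.split(".")
    result = {"basename": basename, "language": None, "handler": None, "mime": None}
    for ext in extensions:
        ext = ext.lower()
        if ext in LANGUAGES:
            result["language"] = LANGUAGES[ext]
        if ext in HANDLERS:
            result["handler"] = HANDLERS[ext]
        if ext in MIME_TYPES:
            result["mime"] = MIME_TYPES[ext]
    return result


print(classify_extensions("File.en.php.png"))
```

The point is that a check looking only at the last extension (.png) says nothing about the handler picked up from an earlier one (.php).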
Metadata Information about the file itself Not always parsed by the file consumer “Comment” fields, few restrictions on data Files can be inserted into comment fields for one format ID3 tags for mp3 files will be shown in players But not usually interpreted
Unreferenced blocks of data Certain formats define resources with offsets and sizes Unmentioned parts of the file are ignored Other files can occupy unmentioned space Other formats indicate a total size of data to be parsed Any additional data is ignored Other files can simply be appended
Unreferenced PDF object The PDF xref table lists object offsets in the file We first remove one reference Next, we replace part of that object’s content…
Start/End markers Many formats use a magic byte sequence to denote the beginning of data Similarly, many have one to denote the end of data Data outside start/end markers is ignored Files can be placed before or after such markers Files must not contain conflicting markers
Start/End markers JPEG Start marker: 0xFFD8 End marker: 0xFFD9 RAR Start marker: 0x526172211A0700 PDF Start marker: %PDF End marker: \n%%EOF\n (\r and \r\n can replace \n) PHP Start marker: <?php End marker: ?>
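As a minimal sketch, the start markers listed on this slide can be searched for anywhere in a file, not only at offset 0. The function name `find_start_markers` is hypothetical and the sample is placeholder bytes, not a real image or archive.

```python
# Start markers taken from the slide above.
START_MARKERS = {
    "JPEG": bytes.fromhex("FFD8"),
    "RAR": bytes.fromhex("526172211A0700"),
    "PDF": b"%PDF",
    "PHP": b"<?php",
}


def find_start_markers(data: bytes) -> dict:
    """Return {format: offset} for each start marker present in data."""
    offsets = {}
    for name, marker in START_MARKERS.items():
        pos = data.find(marker)
        if pos != -1:
            offsets[name] = pos
    return offsets


# Placeholder bytes: a JPEG start and end marker with a RAR signature appended.
sample = bytes.fromhex("FFD8") + b"image data" + bytes.fromhex("FFD9")
sample += bytes.fromhex("526172211A0700") + b"archive data"

print(find_start_markers(sample))  # {'JPEG': 0, 'RAR': 14}
```

Short markers such as the two-byte JPEG one will also turn up by chance inside unrelated binary data, so a hit from a scan like this is a hint that a second format may be present, not proof.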
Limitations Some formats use absolute offsets They must be placed at start of file or offsets must be adjusted Examples: JPEG, BMP, PDF Some have headers which indicate the size of each resource to follow Such files are usually easy to work with Other files can be appended without breaking things Examples: RAR
Limitations Some files are simply parsed from start to end Such files require some metadata, unreferenced space, or data which can be manipulated to have multiple meanings Different parsers for the same format operate differently Might implement different non-standard features May interpret format of files in different ways
TrueCrypt volumes No start/end markers No publicly known signature Parsed from start of file to end of file No metadata fields No unused space Data is difficult to manipulate
Security Implications Reasons why file piggybacking must be considered
Security Implications File upload pwnage Checking for well-formed images doesn’t prevent backdoor upload Anti-Virus evasion Some AV detect file format being scanned then apply format specific rules If file is multiple formats the wrong rules might be applied Data infiltration/exfiltration Do you care what .mp3 files pass in and out of your network? How about .exe and .doc files?
Security Implications Multiple file consumers Different programs may interpret the file in different ways GIFAR issue Parasitic storage How many file uploads allow only valid images? Disk space exhaustion DoS Some image uploads limit uploads by picture dimensions Size of the file may not actually be checked
File upload pwnage Imagine a Web-based image upload utility It confirms that the uploaded file is a valid JPEG It doesn’t check the file extension It uploads the file into the Web root It doesn’t set the permissions to disallow execution Code upload is possible if the file is also a valid JPEG This isn’t hard…
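A hedged sketch of the checks the imagined upload utility skips. The function name, the allowlist and the naming scheme are all assumptions made for illustration. It addresses only the file extension and file name; storing uploads outside the Web root and disallowing execution still have to be handled by the server setup.

```python
import os
import uuid

ALLOWED_EXTENSIONS = {".jpg", ".jpeg"}  # assumed allowlist
JPEG_START = b"\xff\xd8"


def safe_upload_name(original_name: str, data: bytes):
    """Return a server-generated filename, or None if the upload is rejected."""
    ext = os.path.splitext(original_name)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        return None  # e.g. "shell.php" or "shell.jpg.php"
    if not data.startswith(JPEG_START):
        return None  # does not even begin like a JPEG
    # Discard the client-supplied name so extra extensions cannot survive.
    return uuid.uuid4().hex + ext


print(safe_upload_name("shell.php", b"\xff\xd8..."))      # None
print(safe_upload_name("shell.php.jpg", b"\xff\xd8..."))  # random hex name ending in .jpg
```

Note that a file passing both checks can still carry embedded code; the generated name simply ensures no extension that triggers a script handler is kept.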
Anti-Virus evasion exercise Check detection rates on Win32 netcat Place it in an archive and check Put junk data at the beginning of the file and check Piggyback the archive onto the end of a JPEG and check Change the file extension to .JPG and check