Jack of all Formats

In this presentation, I discuss four different approaches to merging multiple files of different formats into one, such that the result can be read as each type. I then discuss the security implications of this property, which is inherent in many file formats, and theorize about attacks that can be launched when developers assume that a file can only be one format.



Usage Rights

© All Rights Reserved

  • Here we have placed the string “test\n” in front of a valid 7zip file.
  • Given that the file doesn’t start with the 7zip start marker and instead begins with plaintext and a newline, the UNIX ‘file’ utility misinterprets it as a data file. p7zip, on the other hand, begins its interpretation of the file starting at the 7zip header. This results in the file still being a valid 7zip archive.
  • Here, while saving a GIF in GIMP, we write a PHP backdoor into a comment. The comment is mostly ignored when the file is parsed as an image, and because PHP only interprets code between its start and end markers, “<?php” and “?>”, the image data does not affect the execution of the script.
  • The backdoor is written directly into the file.
  • Here is the combination PDF and 7zip file we’ve created, opened as a PDF.
  • Then, we change the file extension (though this actually should be unnecessary) and list the contents of the embedded 7zip archive.
  • This is a JPEG file. It looks ordinary and parses correctly.
  • When we interpret the same file as a RAR archive, we find that we have a valid archive, too! This RAR archive was simply appended to the end of our original JPEG. While it is possible to append a RAR to the end of a JPEG and get a file which opens as either format, it is not possible to append a JPEG to the end of a RAR and achieve the same results. This is due to the use of absolute offsets in the JPEG format which must be adjusted to point to the correct resources.
  • Before the fix was put in place, book-sharing threads were fairly commonplace on 4chan: people appended RAR files containing ebooks to JPEGs of the corresponding book covers. Others could download the JPEG, change the extension to .rar, and get the ebook.

Presentation Transcript

  • Jack of all Formats
    Daniel “unicornFurnace” Crowley
    Application Security Services, Trustwave - SpiderLabs
  • Introductions
    How can files be multiple formats?
    Why is this interesting from a security perspective?
    What can we do about it?
    (yodawg we heard you like files so we put files in your files)
  • Terms
    File piggybacking
    Placing one file into another
    File consumption
    Parsing a file and interpreting its contents
  • Scope of this talk
    Files which can be interpreted as multiple formats
    …with at most a change of file extension
    Covert channels
    Through use of piggybacking
    Examples are mostly Web-centric
    Only because it’s my specialty
    This concept applies to more than Web applications
    Srsly this applies to more than Web applications
  • Files with multiple formats
    How to piggyback files
    (Clap and cheer now to confuse the people who can’t read this)
  • File format flexibility
    Not always rigidly defined
    From the PDF specification: “This standard does not specify the following: … methods for validating the conformance of PDF files or readers…”
    Thank you Julia Wolf for “OMG WTF PDF”
    CSV comments exist but are not part of the standard
    Not all data in a file is parsed
    Unreferenced blocks of data
    Data outside start/end markers
    Reserved, unused fields
  • File format flexibility
    Some data can be interpreted multiple ways
    Method of file consumption often determined by:
    File extension
    Multiple file extensions may result in multiple parses
    Bytes at beginning of file
    First identified file header
  • 7zip file with junk data at the beginning
  • 7zip file with junk data at the beginning
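The difference between the two consumers can be sketched in a few lines of Python. This is a simplified model, not how `file` or p7zip are actually implemented: the function names are hypothetical, and `fake_archive` is a stand-in that only carries the 7zip signature followed by placeholder bytes.

```python
SEVENZIP_MAGIC = bytes.fromhex("377abcaf271c")  # '7z' followed by BC AF 27 1C


def identify_at_start(data: bytes) -> str:
    """Model of a strict identifier that only inspects offset 0."""
    return "7z" if data.startswith(SEVENZIP_MAGIC) else "data"


def find_first_header(data: bytes) -> int:
    """Model of a lenient consumer that scans for the first 7zip signature.

    Returns the offset of the signature, or -1 if there is none.
    """
    return data.find(SEVENZIP_MAGIC)


# Stand-in for a real archive (assumption: only the signature matters here).
fake_archive = SEVENZIP_MAGIC + b"\x00\x04" + b"placeholder-body"
piggybacked = b"test\n" + fake_archive

print(identify_at_start(piggybacked))  # the strict check reports "data"
print(find_first_header(piggybacked))  # the scan finds the header at offset 5
```

Any tool that decides what a file is with the first function and then hands it to a consumer that behaves like the second one will disagree with that consumer about the file's type.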
  • Multiple file extensions
    Apache has:
    MIME types
    Example: file.en.php.png
    “file”: basename, largely ignored
    “.en”: language, US English
    “.php”: triggers PHP handler
    “.png”: triggers image/png MIME type
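A rough model of this behaviour for a name such as file.en.php.png, assuming each extension is looked up independently. The three mapping tables are hypothetical and heavily simplified; in a real Apache configuration they come from directives such as `AddLanguage`, `AddHandler`, and `AddType`, and the actual matching rules have more cases than this sketch covers.

```python
# Hypothetical, simplified lookup tables (assumptions, not Apache defaults).
LANGUAGES = {"en": "US English"}
HANDLERS = {"php": "php-handler"}
MIME_TYPES = {"png": "image/png", "jpg": "image/jpeg"}


def classify(filename: str) -> dict:
    """Give every extension a chance to contribute one piece of metadata."""
    basename, *extensions = filename.split(".")
    result = {"basename": basename, "language": None, "handler": None, "mime": None}
    for ext in extensions:
        ext = ext.lower()
        if ext in LANGUAGES:
            result["language"] = LANGUAGES[ext]
        if ext in HANDLERS:
            result["handler"] = HANDLERS[ext]
        if ext in MIME_TYPES:
            result["mime"] = MIME_TYPES[ext]
    return result


print(classify("file.en.php.png"))
```

The point of the sketch is that the last extension does not "win": a handler and a MIME type can both be assigned to the same file from different extensions.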
  • Metadata
    Information about the file itself
    Not always parsed by the file consumer
    “Comment” fields, few restrictions on data
    Files can be inserted into comment fields for one format
    ID3 tags for mp3 files will be shown in players
    But not usually interpreted
  • Metadata – GIF comment
  • Metadata – GIF comment
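The same idea can be sketched without GIMP. The code below builds a GIF89a skeleton (header, logical screen descriptor, comment extension, trailer) and then looks at the bytes the way a PHP-style consumer would, keeping only what sits between the start and end markers. It is an illustration under assumptions: the skeleton carries no pixel data, so it is not a displayable image, the helper names are hypothetical, and the payload is a harmless placeholder.

```python
import re
import struct


def comment_extension(payload: bytes) -> bytes:
    """Encode payload as a GIF89a comment extension (sub-blocks of <= 255 bytes)."""
    out = bytearray(b"\x21\xfe")  # extension introducer + comment label
    for start in range(0, len(payload), 255):
        block = payload[start:start + 255]
        out.append(len(block))
        out += block
    out.append(0)  # block terminator
    return bytes(out)


def build_gif_with_comment(payload: bytes) -> bytes:
    header = b"GIF89a"
    # 1x1 logical screen, no global color table, background 0, aspect ratio 0.
    screen = struct.pack("<HHBBB", 1, 1, 0, 0, 0)
    return header + screen + comment_extension(payload) + b"\x3b"  # 0x3B = trailer


def php_regions(data: bytes) -> list:
    """Return every byte range a marker-based interpreter would treat as code."""
    return re.findall(rb"<\?php.*?\?>", data, flags=re.DOTALL)


# Payloads shorter than 256 bytes stay contiguous inside a single sub-block.
payload = b"<?php echo 'hello'; ?>"
gif = build_gif_with_comment(payload)
print(gif[:6], php_regions(gif))
```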
  • Unreferenced blocks of data
    Certain formats define resources with offsets and sizes
    Unmentioned parts of the file are ignored
    Other files can occupy unmentioned space
    Other formats indicate a total size of data to be parsed
    Any additional data is ignored
    Other files can simply be appended
    Some formats indicate that unrecognized data is ignored
    • May still need to be formatted correctly
  • Unreferenced PDF object
    PDF xref table, lists object offsets in the file
    We first remove one reference
    Next, we replace part of that object’s content…
  • Unreferenced PDF object
    …with a 7zip file.
  • PDF / 7Z opened as a PDF
  • PDF / 7Z opened as a 7Z
  • PNG file format
    • Static signature
    • Series of chunks
    • IHDR chunk
    • Other chunks including at least one IDAT chunk
    • IEND chunk
  • PNG chunk format
    • 4 byte length field
    • 4 byte identification field
    • Data
    • 4 byte CRC of id field and data field
    Chunks with unknown IDs will be ignored
    The CRC will likely not even be checked
  • jaCK chunk
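A minimal sketch of the chunk layout described above, using only `struct` and `zlib`. It builds a 1x1 grayscale PNG and inserts a private chunk named jaCK just before IEND. The helper names and the chunk payload are assumptions made for illustration.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_chunk(chunk_type: bytes, data: bytes) -> bytes:
    """4-byte length, 4-byte type, data, then a CRC over type + data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def minimal_png() -> bytes:
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)  # 1x1, 8-bit grayscale
    idat = zlib.compress(b"\x00\x00")  # filter byte 0 + one gray pixel
    return (PNG_SIGNATURE + make_chunk(b"IHDR", ihdr)
            + make_chunk(b"IDAT", idat) + make_chunk(b"IEND", b""))


def insert_before_iend(png: bytes, chunk: bytes) -> bytes:
    """Assumes a well-formed PNG whose last 12 bytes are the empty IEND chunk."""
    if png[-8:-4] != b"IEND":
        raise ValueError("expected IEND as the final chunk")
    return png[:-12] + chunk + png[-12:]


def chunk_types(png: bytes) -> list:
    """Walk the chunk list and return the type of every chunk up to IEND."""
    types, pos = [], len(PNG_SIGNATURE)
    while pos + 8 <= len(png):
        (length,) = struct.unpack(">I", png[pos:pos + 4])
        types.append(png[pos + 4:pos + 8])
        if types[-1] == b"IEND":
            break
        pos += 12 + length
    return types


carrier = insert_before_iend(minimal_png(), make_chunk(b"jaCK", b"hidden payload"))
print(chunk_types(carrier))
```

The lowercase first letter of jaCK marks the chunk as ancillary, which is what lets a decoder skip it when it does not recognise the type.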
  • Start/End markers
    Many formats use a magic byte sequence to denote the beginning of data
    Similarly, many have one to denote the end of data
    Data outside start/end markers is ignored
    Files can be placed before or after such markers
    Files must not contain conflicting markers
  • Start/End markers
    JPEG start marker: 0xFFD8
    JPEG end marker: 0xFFD9
    RAR start marker: 0x526172211A0700
    PDF start marker: %PDF
    PDF end marker: \n%%EOF\n (\r and \r\n can replace \n)
    PHP start marker: <?php
    PHP end marker: ?>
  • A WinRAR is you!
  • A WinRAR is also JPEG!
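Why the order of concatenation matters can be shown with stand-in data. The sketch assumes, as the talk describes, that an image viewer wants the JPEG start marker at offset 0 while an archive tool scans for the RAR signature wherever it occurs. Both stubs are placeholders, not real files, and the function names are hypothetical.

```python
JPEG_SOI = b"\xff\xd8"  # JPEG start marker
JPEG_EOI = b"\xff\xd9"  # JPEG end marker
RAR_MAGIC = bytes.fromhex("526172211a0700")  # "Rar!" 1A 07 00

# Placeholder bodies: only the markers are meaningful in this sketch.
jpeg_stub = JPEG_SOI + b"<image segments>" + JPEG_EOI
rar_stub = RAR_MAGIC + b"<archive blocks>"


def opens_as_jpeg(data: bytes) -> bool:
    """Model of a viewer that requires the start marker at offset 0."""
    return data.startswith(JPEG_SOI)


def rar_offset(data: bytes) -> int:
    """Model of an archive tool that scans for its signature; -1 if absent."""
    return data.find(RAR_MAGIC)


jpeg_then_rar = jpeg_stub + rar_stub
rar_then_jpeg = rar_stub + jpeg_stub

print(opens_as_jpeg(jpeg_then_rar), rar_offset(jpeg_then_rar))
print(opens_as_jpeg(rar_then_jpeg), rar_offset(rar_then_jpeg))
```

Slicing `jpeg_then_rar[rar_offset(jpeg_then_rar):]` recovers the archive bytes unchanged, which is all the extension-renaming trick relies on.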
  • Limitations
    Some formats use absolute offsets
    They must be placed at start of file or offsets must be adjusted
    Examples: JPEG, BMP, PDF
    Some have headers which indicate the size of each resource to follow
    Such files are usually easy to work with
    Other files can be appended without breaking things
    Examples: RAR
  • Limitations
    Some files are simply parsed from start to end
    Such files require some metadata, unreferenced space, or data which can be manipulated to have multiple meanings
    Different parsers for the same format operate differently
    Might implement different non-standard features
    May interpret format of files in different ways
  • TrueCrypt volumes
    No start/end markers
    No publicly known signature
    Parsed from start of file to end of file
    No metadata fields
    No unused space
    Data is difficult to manipulate
  • TrueCrypt volumes
  • Security Implications
    Reasons why file piggybacking must be considered
    (Read the first word in every sub-bullet on the next slide)
  • Security Implications
    Data infiltration/exfiltration
    Never check what .mp3 files pass in and out of your network?
    Gonna change that when you get back to the office?
    Anti-Virus evasion
    Give an AV a piggybacked file, it might apply the wrong rules
    You might not know that most AV applies heuristics/signatures based on identified file format!
    File upload pwnage
    Uploading well-formed images that are also backdoors is possible
  • Security Implications
    Multiple file consumers
    Different programs may interpret the file in different ways
    GIFAR issue
    Parasitic storage
    How many file uploads allow only valid images?
    Disk space exhaustion DoS
    Some image uploads limit uploads by picture dimensions
    Size of the file may not actually be checked
  • File upload pwnage
    Imagine a Web-based image upload utility
    It confirms that the uploaded file is a valid JPEG
    It doesn’t check the file extension
    It uploads the file into the Web root
    It doesn’t set the permissions to disallow execution
    Code upload is possible if the file is also a valid JPEG
    This isn’t hard…
  • Anti-Virus evasion exercise
    Check detection rates on Win32 netcat
    Place it in an archive and check
    Put junk data at the beginning of the file and check
    Piggyback the archive onto the end of a JPEG and check
    Change the file extension to .JPG and check
  • Check detection rates on netcat
  • Archive netcat and check again
  • Add junk at the beginning of the file
  • Piggyback the archive onto a JPEG
  • Change the extension to .jpg
  • LULZ netkitties
  • Data Infiltration
    Take the previous example of a 7z attached to a JPEG
    This will bypass lots of AV
    Maybe also IDS/IPS
    Haven’t tested it
  • Data Exfiltration
    • DLP will generally look for:
    • Type of files being communicated
    • Content of traffic
    • Communication properties
    • These techniques allow for covert channels
    • With wide bandwidth
    • With some plausible deniability
    • In files which are
    • Ordinarily harmless
    • Frequently passed
    • Without breaking the piggybacked files’ usability
  • Parasitic storage
    • Certain sites allow for file upload of specific formats
    File piggybacking essentially removes this limitation
    • This technique has been used on 4chan (now fixed)
    Book sharing threads
    LOIC distribution
    CP distribution
    • Still works on certain image sites
    • Browsers automagically download images
    • What if those images are also malware?
    • Now all you need to do is figure out how to execute it…
  • Multiple File Consumers
    • GIFAR issue
    • JAR appended to the end of a GIF
    • Browser loads the GIF
    • Old versions of JVM would recognize AND RUN the JAR
    • Apache handling “file.en.php.png”
    • Passes file to PHP for preprocessing
    • Serves resulting output with
    • a Content-Language of US English
    • MIME type of “image/png”
  • Disk Space Exhaustion DoS
    • Imagine a file upload utility
    • It allows the upload of only 1x1 images
    • For disk space reasons
    • Append 2GB of junk onto the end of a 1x1 image
    • ???
    • NO DISK SPACE!!!
    • Checking properties of the file format may not be sufficient
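The gap can be sketched with a 1x1 PNG. The check that reads only the dimensions from the IHDR chunk still passes after junk is appended, while a check that also looks at the byte count does not. The function names and limits are hypothetical, and 64 KiB of padding stands in for the 2GB on the slide.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def chunk(chunk_type: bytes, data: bytes) -> bytes:
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def tiny_png() -> bytes:
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)  # 1x1, 8-bit grayscale
    idat = zlib.compress(b"\x00\x00")
    return PNG_SIGNATURE + chunk(b"IHDR", ihdr) + chunk(b"IDAT", idat) + chunk(b"IEND", b"")


def png_dimensions(data: bytes) -> tuple:
    """Read width and height from the IHDR chunk that follows the signature."""
    if data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG")
    return struct.unpack(">II", data[16:24])


def dimensions_only_check(data: bytes, max_side: int = 1) -> bool:
    width, height = png_dimensions(data)
    return width <= max_side and height <= max_side


def dimensions_and_size_check(data: bytes, max_side: int = 1, max_bytes: int = 4096) -> bool:
    # The byte count is a property of the upload, not of the image format.
    return len(data) <= max_bytes and dimensions_only_check(data, max_side)


padded = tiny_png() + b"\x00" * (64 * 1024)
print(png_dimensions(padded), len(padded))
print(dimensions_only_check(padded), dimensions_and_size_check(padded))
```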
  • Protections
    What can we do about this?
    (Not much)
  • File upload with code
    • Don’t upload in the Web root
    • Don’t allow the user to control any part of the filename
    • Don’t set the perms to executable
    • Don’t trust file properties
    • Allow only one extension
    • Allow only known good extensions
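Several of these points can be combined in one small sketch: the stored name is generated by the server, exactly one extension from an allow-list is accepted, and the file is created without execute permission. The allow-list, the function names, and the temporary directory standing in for a location outside the Web root are all assumptions.

```python
import os
import secrets
import tempfile
from typing import Optional

ALLOWED_EXTENSIONS = {"jpg", "png", "gif"}  # assumed allow-list


def safe_upload_name(client_filename: str) -> Optional[str]:
    """Return a server-chosen name, or None if the client name is unacceptable."""
    parts = client_filename.split(".")
    if len(parts) != 2 or not parts[0]:
        return None  # no extension, or more than one (e.g. file.en.php.png)
    extension = parts[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        return None
    # Nothing from the client-supplied name survives except the vetted extension.
    return secrets.token_hex(16) + "." + extension


def store_upload(data: bytes, client_filename: str, upload_dir: str) -> Optional[str]:
    name = safe_upload_name(client_filename)
    if name is None:
        return None
    path = os.path.join(upload_dir, name)
    # 0o640: readable and writable, never executable; O_EXCL refuses to overwrite.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o640)
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
    return path


# upload_dir stands in for a directory outside the Web root.
with tempfile.TemporaryDirectory() as upload_dir:
    print(store_upload(b"...", "file.en.php.png", upload_dir))  # rejected: None
    saved = store_upload(b"...", "holiday.PNG", upload_dir)
    print(os.path.basename(saved))
```

None of this inspects the file content, so it limits what a piggybacked file can do once stored rather than detecting it.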
  • Anti-virus Evasion
    • We could:
    • Check for all valid file headers
    • Performance hit
    • Apply all signatures/heuristics globally
    • Big freakin’ performance hit
    • Identify by behavior
    • This doesn’t work on gateway AV
  • Disk Space Exhaustion
    • Don’t just check properties from the expected format
    • ’Nuff said
    • Put some additional protection in place
    • Disk quota
    • Separate partition for uploads
  • Parasitic storage
    • In metadata
    • Remove metadata
    • At end of file
    • Parse out relevant format data and save as new file
    • In unreferenced block or as part of real data
    • Don’t upload files?
    • Don’t allow unauthenticated file upload?
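For PNG, "parse out relevant format data and save as new file" might look like the sketch below: walk the chunk list, verify each CRC, copy only the critical chunk types, and stop at IEND so that anything appended afterwards is dropped. The keep-list is an assumption (dropping ancillary chunks such as transparency can change how an image renders), and the approach does nothing about data hidden in the pixel values themselves.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
KEEP = {b"IHDR", b"PLTE", b"IDAT", b"IEND"}  # assumed keep-list: critical chunks only


def chunk(chunk_type: bytes, data: bytes) -> bytes:
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def sanitize_png(data: bytes) -> bytes:
    """Rebuild the file from known chunks; drop everything else, including trailers."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG")
    out = bytearray(PNG_SIGNATURE)
    pos = len(PNG_SIGNATURE)
    while True:
        if pos + 8 > len(data):
            raise ValueError("truncated before IEND")
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        end = pos + 12 + length
        if end > len(data):
            raise ValueError("chunk runs past end of file")
        body = data[pos + 8:pos + 8 + length]
        (stored_crc,) = struct.unpack(">I", data[end - 4:end])
        if stored_crc != zlib.crc32(chunk_type + body) & 0xFFFFFFFF:
            raise ValueError("bad CRC")
        if chunk_type in KEEP:
            out += data[pos:end]
        pos = end
        if chunk_type == b"IEND":
            return bytes(out)


ihdr = chunk(b"IHDR", struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0))
idat = chunk(b"IDAT", zlib.compress(b"\x00\x00"))
iend = chunk(b"IEND", b"")

clean = PNG_SIGNATURE + ihdr + idat + iend
dirty = PNG_SIGNATURE + ihdr + chunk(b"jaCK", b"stowaway") + idat + iend + b"appended archive"
print(len(dirty), len(sanitize_png(dirty)))
```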
  • Questions?
    Daniel Crowley